Notes on Mathematics - 102


Peeyush Chandra, A. K. Lal, V. Raghavendra, G. Santhanam 


Supported by a grant from MHRD


Contents

I Linear Algebra

1 Matrices
1.1 Definition of a Matrix
1.1.1 Special Matrices
1.2 Operations on Matrices
1.2.1 Multiplication of Matrices
1.3 Some More Special Matrices
1.3.1 Submatrix of a Matrix
1.3.1 Block Matrices
1.4 Matrices over Complex Numbers

2 Linear System of Equations
2.1 Introduction
2.2 Definition and a Solution Method
2.2.1 A Solution Method
2.4.2 Elementary Matrices
2.7 Invertible Matrices
2.7.1 Inverse of a Matrix
2.7.2 Equivalent conditions for Invertibility
2.7.3 Inverse and Gauss-Jordan Method
2.8 Determinant
2.8.1 Adjoint of a Matrix
2.8.2 Cramer's Rule
2.9 Miscellaneous Exercises

3 Finite Dimensional Vector Spaces
3.1 Vector Spaces
3.1.1 Definition
3.1.2 Examples
3.1.4 Linear Combinations
3.3.1 Important Results
3.4 Ordered Bases

4 Linear Transformations
4.1 Definitions and Basic Properties
4.2 Matrix of a linear transformation
4.3 Rank-Nullity Theorem
4.4 Similarity of Matrices

5 Inner Product Spaces
5.1 Definition and Basic Properties
5.2 Gram-Schmidt Orthogonalisation Process
5.3 Orthogonal Projections and Applications
5.3.1 Matrix of the Orthogonal Projection

II Ordinary Differential Equation

7 Differential Equations
7.1 Introduction and Preliminaries
7.2.1 Equations Reducible to Separable Form
7.3 Exact Equations
7.3.1 Integrating Factors
7.4 Linear Equations
7.5 Miscellaneous Remarks
7.6 Initial Value Problems
7.6.1 Orthogonal Trajectories
7.7 Numerical Methods

8 Second Order and Higher Order Equations
8.1 Introduction
8.2 More on Second Order Equations
8.2.1 Wronskian
8.2.2 Method of Reduction of Order
8.3 Second Order equations with Constant Coefficients
8.4 Non Homogeneous Equations
8.5 Variation of Parameters
8.7 Method of Undetermined Coefficients

9 Solutions Based on Power Series
9.1 Introduction
9.1.1 Properties of Power Series
9.2 Solutions in terms of Power Series
9.3 Statement of Frobenius Theorem for Regular (Ordinary) Point
9.4 Legendre Equations and Legendre Polynomials
9.4.1 Introduction
9.4.2 Legendre Polynomials

III Laplace Transform

10 Laplace Transform
10.1 Introduction
10.2 Definitions and Examples
10.2.1 Examples
10.3 Properties of Laplace Transform
10.3.1 Inverse Transforms of Rational Functions
10.3.2 Transform of Unit Step Function
10.4 Some Useful Results
10.4.1 Limiting Theorems
10.5 Application to Differential Equations
10.6 Transform of the Unit-Impulse Function

IV Numerical Applications

11 Newton's Interpolation Formulae
11.1 Introduction
11.2 Difference Operator
11.2.1 Forward Difference Operator
11.2.2 Backward Difference Operator
11.2.3 Central Difference Operator
11.2.4 Shift Operator
11.2.5 Averaging Operator
11.3 Relations between Difference operators
11.4 Newton's Interpolation Formulae

12 Lagrange's Interpolation Formula
12.1 Introduction
12.2 Divided Differences
12.3 Lagrange's Interpolation formula
12.4 Gauss's and Stirling's Formulas

13.3.1 A General Quadrature Formula
13.3.2 Trapezoidal Rule
13.3.3 Simpson's Rule

14.1 System of Linear Equations



Part I 


Linear Algebra 


Chapter 1 


Matrices 


1.1 Definition of a Matrix 


Definition 1.1.1 (Matrix) A rectangular array of numbers is called a matrix. 


We shall mostly be concerned with matrices having real numbers as entries. 
The horizontal arrays of a matrix are called its ROWS and the vertical arrays are called its COLUMNS. 
A matrix having m rows and n columns is said to have the order m x n. 


A matrix A of ORDER m x n can be represented in the following form: 


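\[
A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix},
\]

where $a_{ij}$ is the entry at the intersection of the $i$th row and $j$th column.

In a more concise manner, we also denote the matrix $A$ by $[a_{ij}]$ by suppressing its order.

Remark 1.1.2 Some books also use
\[
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{pmatrix}
\]
to represent a matrix.

Let $A = \begin{bmatrix} 1 & 3 & 7 \\ 4 & 5 & 6 \end{bmatrix}$. Then $a_{11} = 1$, $a_{12} = 3$, $a_{13} = 7$, $a_{21} = 4$, $a_{22} = 5$, and $a_{23} = 6$.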


A matrix having only one column is called a COLUMN VECTOR; and a matrix with only one row is 
called a ROW VECTOR. 

WHENEVER A VECTOR IS USED, IT SHOULD BE UNDERSTOOD FROM THE CONTEXT WHETHER IT IS 
A ROW VECTOR OR A COLUMN VECTOR. 


Definition 1.1.3 (Equality of two Matrices) Two matrices $A = [a_{ij}]$ and $B = [b_{ij}]$ having the same order $m \times n$ are equal if $a_{ij} = b_{ij}$ for each $i = 1, 2, \ldots, m$ and $j = 1, 2, \ldots, n$.

In other words, two matrices are said to be equal if they have the same order and their corresponding entries are equal.

Example 1.1.4 The linear system of equations $2x + 3y = 5$ and $3x + 2y = 5$ can be identified with the matrix $\left[\begin{array}{cc|c} 2 & 3 & 5 \\ 3 & 2 & 5 \end{array}\right]$.


1.1.1 Special Matrices 


Definition 1.1.5 1. A matrix in which each entry is zero is called a zero-matrix, denoted by $0$. For example,
\[
0_{2 \times 2} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \quad \text{and} \quad 0_{2 \times 3} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}.
\]

2. A matrix having the number of rows equal to the number of columns is called a square matrix. Thus, its order is $m \times m$ (for some $m$) and is represented by $m$ only.

3. In a square matrix $A = [a_{ij}]$ of order $n$, the entries $a_{11}, a_{22}, \ldots, a_{nn}$ are called the diagonal entries and form the principal diagonal of $A$.

4. A square matrix $A = [a_{ij}]$ is said to be a diagonal matrix if $a_{ij} = 0$ for $i \neq j$. In other words, the non-zero entries appear only on the principal diagonal. For example, the zero matrix $0_n$ and $\begin{bmatrix} 4 & 0 \\ 0 & 1 \end{bmatrix}$ are a few diagonal matrices.
A diagonal matrix $D$ of order $n$ with the diagonal entries $d_1, d_2, \ldots, d_n$ is denoted by $D = \mathrm{diag}(d_1, \ldots, d_n)$.
If $d_i = d$ for all $i = 1, 2, \ldots, n$ then the diagonal matrix $D$ is called a scalar matrix.

5. A square matrix $A = [a_{ij}]$ with
\[
a_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}
\]
is called the identity matrix, denoted by $I_n$.
For example, $I_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$ and $I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$.
The subscript $n$ is suppressed in case the order is clear from the context or if no confusion arises.

6. A square matrix $A = [a_{ij}]$ is said to be an upper triangular matrix if $a_{ij} = 0$ for $i > j$.
A square matrix $A = [a_{ij}]$ is said to be a lower triangular matrix if $a_{ij} = 0$ for $i < j$.
A square matrix $A$ is said to be triangular if it is an upper or a lower triangular matrix.

For example, $\begin{bmatrix} 2 & 1 & 4 \\ 0 & 3 & -1 \\ 0 & 0 & -2 \end{bmatrix}$ is an upper triangular matrix. An upper triangular matrix will be represented by
\[
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
0 & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & a_{nn}
\end{bmatrix}.
\]

1.2 Operations on Matrices 


Definition 1.2.1 (Transpose of a Matrix) The transpose of an $m \times n$ matrix $A = [a_{ij}]$ is defined as the $n \times m$ matrix $B = [b_{ij}]$, with $b_{ij} = a_{ji}$ for $1 \le i \le n$ and $1 \le j \le m$. The transpose of $A$ is denoted by $A^t$.




That is, by the transpose of an m x n matrix A, we mean a matrix of order n x m having the rows 


of A as its columns and the columns of A as its rows. 


For example, if $A = \begin{bmatrix} 1 & 4 & 5 \\ 0 & 1 & 2 \end{bmatrix}$ then $A^t = \begin{bmatrix} 1 & 0 \\ 4 & 1 \\ 5 & 2 \end{bmatrix}$.


Thus, the transpose of a row vector is a column vector and vice-versa. 
Theorem 1.2.2 For any matrix $A$, we have $(A^t)^t = A$.

Proof. Let $A = [a_{ij}]$, $A^t = [b_{ij}]$ and $(A^t)^t = [c_{ij}]$. Then, the definition of transpose gives

\[
c_{ij} = b_{ji} = a_{ij} \quad \text{for all } i, j
\]


and the result follows. 
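
As a quick numerical companion to Theorem 1.2.2, here is a small sketch that transposes a matrix stored as a list of rows and checks that transposing twice returns the original. The helper name `transpose` is an illustrative choice.

```python
def transpose(a):
    """Return the transpose: the rows of `a` become the columns."""
    return [list(col) for col in zip(*a)]

A = [[1, 4, 5],
     [0, 1, 2]]          # a 2 x 3 matrix
At = transpose(A)        # a 3 x 2 matrix: [[1, 0], [4, 1], [5, 2]]
print(At)
print(transpose(At) == A)
```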


Definition 1.2.3 (Addition of Matrices) Let $A = [a_{ij}]$ and $B = [b_{ij}]$ be two $m \times n$ matrices. Then the sum $A + B$ is defined to be the matrix $C = [c_{ij}]$ with $c_{ij} = a_{ij} + b_{ij}$.

Note that we define the sum of two matrices only when the orders of the two matrices are the same.

Definition 1.2.4 (Multiplying a Scalar to a Matrix) Let $A = [a_{ij}]$ be an $m \times n$ matrix. Then for any element $k \in \mathbb{R}$, we define $kA = [k a_{ij}]$.

For example, if $A = \begin{bmatrix} 1 & 4 & 5 \\ 0 & 1 & 2 \end{bmatrix}$ and $k = 5$, then $5A = \begin{bmatrix} 5 & 20 & 25 \\ 0 & 5 & 10 \end{bmatrix}$.


Theorem 1.2.5 Let $A$, $B$ and $C$ be matrices of order $m \times n$, and let $k, \ell \in \mathbb{R}$. Then
1. $A + B = B + A$ (commutativity).
2. $(A + B) + C = A + (B + C)$ (associativity).
3. $k(\ell A) = (k\ell)A$.
4. $(k + \ell)A = kA + \ell A$.

Proof. Part 1.
Let $A = [a_{ij}]$ and $B = [b_{ij}]$. Then

\[
A + B = [a_{ij}] + [b_{ij}] = [a_{ij} + b_{ij}] = [b_{ij} + a_{ij}] = [b_{ij}] + [a_{ij}] = B + A
\]

as real numbers commute.
The reader is required to prove the other parts as all the results follow from the properties of real numbers.


Exercise 1.2.6 1. Suppose A + B = A. Then show that B = 0. 


2. Suppose $A + B = 0$. Then show that $B = (-1)A = [-a_{ij}]$.


Definition 1.2.7 (Additive Inverse) Let A be an m x n matrix. 


1. Then there exists a matrix B with A + B = 0. This matrix B is called the additive inverse of A, and 
is denoted by —A = (—1)A. 


2. Also, for the matrix $0_{m \times n}$, $A + 0 = 0 + A = A$. Hence, the matrix $0_{m \times n}$ is called the additive identity.

1.2.1 Multiplication of Matrices


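Definition 1.2.8 (Matrix Multiplication / Product) Let $A = [a_{ij}]$ be an $m \times n$ matrix and $B = [b_{ij}]$ be an $n \times r$ matrix. The product $AB$ is a matrix $C = [c_{ij}]$ of order $m \times r$, with

\[
c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj} = a_{i1} b_{1j} + a_{i2} b_{2j} + \cdots + a_{in} b_{nj}.
\]

Observe that the product $AB$ is defined if and only if
THE NUMBER OF COLUMNS OF A = THE NUMBER OF ROWS OF B.

For example, if $A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 0 & 3 \\ 1 & 0 & 4 \end{bmatrix}$ then

\[
AB = \begin{bmatrix} 1 + 0 + 3 & 2 + 0 + 0 & 1 + 6 + 12 \\ 2 + 0 + 1 & 4 + 0 + 0 & 2 + 12 + 4 \end{bmatrix} = \begin{bmatrix} 4 & 2 & 19 \\ 3 & 4 & 18 \end{bmatrix}.
\]
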

Note that in this example, while AB is defined, the product BA is not defined. However, for square 
matrices A and B of the same order, both the product AB and BA are defined. 
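
The formula $c_{ij} = \sum_k a_{ik} b_{kj}$ can be sketched in a few lines. This is an illustrative implementation on lists of rows (the name `matmul` is an assumption, not notation from the notes); it reproduces the product of the $2 \times 3$ matrix $A$ with rows $(1, 2, 3)$, $(2, 4, 1)$ and the $3 \times 3$ matrix $B$ with rows $(1, 2, 1)$, $(0, 0, 3)$, $(1, 0, 4)$, and refuses products whose orders do not match.

```python
def matmul(a, b):
    """Product of an m x n matrix `a` and an n x r matrix `b`."""
    if len(a[0]) != len(b):
        raise ValueError("number of columns of A must equal number of rows of B")
    n, r = len(b), len(b[0])
    # c_ij = sum over k of a_ik * b_kj
    return [[sum(row[k] * b[k][j] for k in range(n)) for j in range(r)]
            for row in a]

A = [[1, 2, 3],
     [2, 4, 1]]
B = [[1, 2, 1],
     [0, 0, 3],
     [1, 0, 4]]
AB = matmul(A, B)
print(AB)  # [[4, 2, 19], [3, 4, 18]]
```

Calling `matmul(B, A)` raises `ValueError`, since $B$ has 3 columns while $A$ has only 2 rows.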


Definition 1.2.9 Two square matrices A and B are said to commute if AB = BA. 


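Remark 1.2.10 1. Note that if $A$ is a square matrix of order $n$ then $A I_n = I_n A$. Also for any $d \in \mathbb{R}$, the matrix $d I_n$ commutes with every square matrix of order $n$. The matrices $d I_n$ for any $d \in \mathbb{R}$ are called SCALAR matrices.

2. In general, the matrix product is not commutative. For example, consider the following two matrices $A = \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 0 \\ 1 & 0 \end{bmatrix}$. Then check that the matrix product
\[
AB = \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix} \neq \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} = BA.
\]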
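
The check requested in Remark 1.2.10 can be carried out mechanically. The sketch below uses the pair $A$ with rows $(1, 1)$, $(0, 0)$ and $B$ with rows $(1, 0)$, $(1, 0)$, and also confirms on one sample matrix that a scalar matrix $dI_2$ commutes. The sample matrix `M` and the value `d = 3` are arbitrary illustrative choices.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(row[k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for row in a]

A = [[1, 1], [0, 0]]
B = [[1, 0], [1, 0]]
AB = matmul(A, B)   # [[2, 0], [0, 0]]
BA = matmul(B, A)   # [[1, 1], [1, 1]]
print(AB != BA)

d = 3
dI = [[d, 0], [0, d]]        # scalar matrix d * I_2
M = [[1, 2], [3, 4]]         # arbitrary sample matrix
print(matmul(dI, M) == matmul(M, dI))
```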


Theorem 1.2.11 Suppose that the matrices A, B and C are so chosen that the matrix multiplications are 
defined. 


1. Then $(AB)C = A(BC)$. That is, the matrix multiplication is associative.

2. For any $k \in \mathbb{R}$, $(kA)B = k(AB) = A(kB)$.

3. Then $A(B + C) = AB + AC$. That is, multiplication distributes over addition.

4. If $A$ is an $n \times n$ matrix then $A I_n = I_n A = A$.

5. For any square matrix $A$ of order $n$ and $D = \mathrm{diag}(d_1, d_2, \ldots, d_n)$, we have

• the first row of $DA$ is $d_1$ times the first row of $A$;

• for $1 \le i \le n$, the $i$th row of $DA$ is $d_i$ times the $i$th row of $A$.

A similar statement holds for the columns of $A$ when $A$ is multiplied on the right by $D$.


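Proof. Part 1. Let $A = [a_{ij}]_{m \times n}$, $B = [b_{ij}]_{n \times p}$ and $C = [c_{ij}]_{p \times q}$. Then

\[
(BC)_{kj} = \sum_{\ell=1}^{p} b_{k\ell} c_{\ell j} \quad \text{and} \quad (AB)_{i\ell} = \sum_{k=1}^{n} a_{ik} b_{k\ell}.
\]

Therefore,

\[
\begin{aligned}
\big(A(BC)\big)_{ij} &= \sum_{k=1}^{n} a_{ik} (BC)_{kj} = \sum_{k=1}^{n} a_{ik} \Big( \sum_{\ell=1}^{p} b_{k\ell} c_{\ell j} \Big) \\
&= \sum_{k=1}^{n} \sum_{\ell=1}^{p} a_{ik} \big( b_{k\ell} c_{\ell j} \big) = \sum_{k=1}^{n} \sum_{\ell=1}^{p} \big( a_{ik} b_{k\ell} \big) c_{\ell j} \\
&= \sum_{\ell=1}^{p} \Big( \sum_{k=1}^{n} a_{ik} b_{k\ell} \Big) c_{\ell j} = \sum_{\ell=1}^{p} (AB)_{i\ell} c_{\ell j} \\
&= \big((AB)C\big)_{ij}.
\end{aligned}
\]

Part 5. For all $j = 1, 2, \ldots, n$, we have

\[
(DA)_{ij} = \sum_{k=1}^{n} d_{ik} a_{kj} = d_i a_{ij}
\]

as $d_{ik} = 0$ whenever $i \neq k$. Hence, the required result follows.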


The reader is required to prove the other parts. 
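
Part 5 of Theorem 1.2.11 is easy to see on a small case. The sketch below multiplies a sample matrix by a diagonal matrix on the left and on the right; the entries of `D` and `A` are arbitrary illustrative values, and `diag` and `matmul` are hypothetical helper names.

```python
def diag(entries):
    """Diagonal matrix diag(d_1, ..., d_n) as a list of rows."""
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(row[k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for row in a]

D = diag([2, 3])
A = [[1, 2],
     [3, 4]]
DA = matmul(D, A)   # i-th row of A scaled by d_i:    [[2, 4], [9, 12]]
AD = matmul(A, D)   # j-th column of A scaled by d_j: [[2, 6], [6, 12]]
print(DA, AD)
```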


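Exercise 1.2.12 1. Let $A$ and $B$ be two matrices. If the matrix addition $A + B$ is defined, then prove that $(A + B)^t = A^t + B^t$. Also, if the matrix product $AB$ is defined then prove that $(AB)^t = B^t A^t$.

2. Let $A = [a_1, a_2, \ldots, a_n]$ and $B = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}$. Compute the matrix products $AB$ and $BA$.

3. Let $n$ be a positive integer. Compute $A^n$ for the following matrices:

\[
\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}.
\]

Can you guess a formula for $A^n$ and prove it by induction?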
4. Find examples for the following statements. 


(a) Suppose that the matrix product AB is defined. Then the product BA need not be defined. 


(b) Suppose that the matrix products AB and BA are defined. Then the matrices AB and BA can 
have different orders. 


(c) Suppose that the matrices A and B are square matrices of order n. Then AB and BA may or 
may not be equal. 

1.3 Some More Special Matrices


Definition 1.3.1 1. A matrix $A$ over $\mathbb{R}$ is called symmetric if $A^t = A$ and skew-symmetric if $A^t = -A$.

2. A matrix $A$ is said to be orthogonal if $A A^t = A^t A = I$.

Example 1.3.2 1. Let $A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & -1 \\ 3 & -1 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 0 & 1 & 2 \\ -1 & 0 & -3 \\ -2 & 3 & 0 \end{bmatrix}$. Then $A$ is a symmetric matrix and $B$ is a skew-symmetric matrix.

2. Let $A = \begin{bmatrix} \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & 0 \\ \frac{1}{\sqrt{6}} & \frac{1}{\sqrt{6}} & -\frac{2}{\sqrt{6}} \end{bmatrix}$. Then $A$ is an orthogonal matrix.

3. Let $A = [a_{ij}]$ be an $n \times n$ matrix with $a_{ij} = \begin{cases} 1 & \text{if } i = j + 1 \\ 0 & \text{otherwise} \end{cases}$. Then $A^n = 0$ and $A^{\ell} \neq 0$ for $1 \le \ell \le n - 1$. The matrices $A$ for which a positive integer $k$ exists such that $A^k = 0$ are called NILPOTENT matrices. The least positive integer $k$ for which $A^k = 0$ is called the ORDER OF NILPOTENCY.

4. Let $A = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$. Then $A^2 = A$. The matrices that satisfy the condition that $A^2 = A$ are called IDEMPOTENT matrices.
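
The nilpotent and idempotent examples above can be verified by repeated multiplication. The following sketch builds the $n \times n$ matrix with $a_{ij} = 1$ for $i = j + 1$ and finds the least $k$ with $A^k = 0$. The helper names are illustrative, and the search stops at $k = n$, which relies on the standard fact that the order of nilpotency of an $n \times n$ matrix never exceeds $n$.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_zero(a):
    return all(x == 0 for row in a for x in row)

def shift(n):
    """n x n matrix with a_ij = 1 if i = j + 1 and 0 otherwise."""
    return [[1 if i == j + 1 else 0 for j in range(n)] for i in range(n)]

def nilpotency_order(a):
    """Least k in 1..n with a^k = 0, or None if a is not nilpotent."""
    power = a
    for k in range(1, len(a) + 1):
        if is_zero(power):
            return k
        power = matmul(power, a)
    return None

print(nilpotency_order(shift(4)))      # order of nilpotency of the 4 x 4 example

P = [[1, 0], [0, 0]]
print(matmul(P, P) == P)               # idempotent: P^2 = P
```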


Exercise 1.3.3 1. Show that for any square matrix $A$, $S = \frac{1}{2}(A + A^t)$ is symmetric, $T = \frac{1}{2}(A - A^t)$ is skew-symmetric, and $A = S + T$.

2. Show that the product of two lower triangular matrices is a lower triangular matrix. A similar statement holds for upper triangular matrices.

3. Let $A$ and $B$ be symmetric matrices. Show that $AB$ is symmetric if and only if $AB = BA$.

4. Show that the diagonal entries of a skew-symmetric matrix are zero.

5. Let $A$, $B$ be skew-symmetric matrices with $AB = BA$. Is the matrix $AB$ symmetric or skew-symmetric?

6. Let $A$ be a symmetric matrix of order $n$ with $A^2 = 0$. Is it necessarily true that $A = 0$?

7. Let $A$ be a nilpotent matrix. Show that there exists a matrix $B$ such that $B(I + A) = I = (I + A)B$.


1.3.1 Submatrix of a Matrix 


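Definition 1.3.4 A matrix obtained by deleting some of the rows and/or columns of a matrix is said to be a submatrix of the given matrix.

For example, if $A = \begin{bmatrix} 1 & 4 & 5 \\ 0 & 1 & 2 \end{bmatrix}$, a few submatrices of $A$ are

\[
[1], \quad [2], \quad \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad [1 \;\; 5], \quad \begin{bmatrix} 1 & 5 \\ 0 & 2 \end{bmatrix}, \quad A.
\]

But the matrices $\begin{bmatrix} 1 & 4 \\ 1 & 0 \end{bmatrix}$ and $\begin{bmatrix} 1 & 4 \\ 0 & 2 \end{bmatrix}$ are not submatrices of $A$. (The reader is advised to give reasons.)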


Miscellaneous Exercises 


Exercise 1.3.5 1. Complete the proofs of Theorems 1.2.5 and 1.2.11.

2. Let $A = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$ and $B = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$. Geometrically interpret $y = Ax$ and $y = Bx$.

3. Consider the two coordinate transformations

\[
\begin{aligned}
x_1 &= a_{11} y_1 + a_{12} y_2 \\
x_2 &= a_{21} y_1 + a_{22} y_2
\end{aligned}
\quad \text{and} \quad
\begin{aligned}
y_1 &= b_{11} z_1 + b_{12} z_2 \\
y_2 &= b_{21} z_1 + b_{22} z_2.
\end{aligned}
\]

(a) Compose the two transformations to express $x_1, x_2$ in terms of $z_1, z_2$.

(b) If $x^t = [x_1, x_2]$, $y^t = [y_1, y_2]$ and $z^t = [z_1, z_2]$ then find matrices $A$, $B$ and $C$ such that $x = Ay$, $y = Bz$ and $x = Cz$.

(c) Is $C = AB$?

4. For a square matrix $A$ of order $n$, we define the trace of $A$, denoted by $\mathrm{tr}(A)$, as
\[
\mathrm{tr}(A) = a_{11} + a_{22} + \cdots + a_{nn}.
\]
Then for two square matrices $A$ and $B$ of the same order, show the following:

(a) $\mathrm{tr}(A + B) = \mathrm{tr}(A) + \mathrm{tr}(B)$.

(b) $\mathrm{tr}(AB) = \mathrm{tr}(BA)$.

5. Show that there do not exist matrices $A$ and $B$ such that $AB - BA = c I_n$ for any $c \neq 0$.

6. Let $A$ and $B$ be two $m \times n$ matrices and let $x$ be an $n \times 1$ column vector.

(a) Prove that if $Ax = 0$ for all $x$, then $A$ is the zero matrix.

(b) Prove that if $Ax = Bx$ for all $x$, then $A = B$.

7. Let $A$ be an $n \times n$ matrix such that $AB = BA$ for all $n \times n$ matrices $B$. Show that $A = \alpha I$ for some $\alpha \in \mathbb{R}$.

8. Let $A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \\ 3 & 1 \end{bmatrix}$. Show that there exist infinitely many matrices $B$ such that $BA = I_2$. Also, show that there does not exist any matrix $C$ such that $AC = I_3$.


1.3.1 Block Matrices

Let $A$ be an $n \times m$ matrix and $B$ be an $m \times p$ matrix. Suppose $r < m$. Then, we can decompose the matrices $A$ and $B$ as $A = [P \;\; Q]$ and $B = \begin{bmatrix} H \\ K \end{bmatrix}$, where $P$ has order $n \times r$ and $H$ has order $r \times p$. That is, the matrices $P$ and $Q$ are submatrices of $A$: $P$ consists of the first $r$ columns of $A$ and $Q$ consists of the last $m - r$ columns of $A$. Similarly, $H$ and $K$ are submatrices of $B$: $H$ consists of the first $r$ rows of $B$ and $K$ consists of the last $m - r$ rows of $B$. We now prove the following important theorem.

Theorem 1.3.6 Let $A = [a_{ij}] = [P \;\; Q]$ and $B = [b_{ij}] = \begin{bmatrix} H \\ K \end{bmatrix}$ be defined as above. Then

\[
AB = PH + QK.
\]

Proof. First note that the matrices $PH$ and $QK$ are each of order $n \times p$. The matrix products $PH$ and $QK$ are valid as the orders of the matrices $P$, $H$, $Q$ and $K$ are, respectively, $n \times r$, $r \times p$, $n \times (m - r)$ and $(m - r) \times p$. Let $P = [P_{ij}]$, $Q = [Q_{ij}]$, $H = [H_{ij}]$, and $K = [K_{ij}]$. Then, for $1 \le i \le n$ and $1 \le j \le p$, we have

\[
\begin{aligned}
(AB)_{ij} &= \sum_{k=1}^{m} a_{ik} b_{kj} = \sum_{k=1}^{r} a_{ik} b_{kj} + \sum_{k=r+1}^{m} a_{ik} b_{kj} \\
&= \sum_{k=1}^{r} P_{ik} H_{kj} + \sum_{k=r+1}^{m} Q_{ik} K_{kj} \\
&= (PH)_{ij} + (QK)_{ij} = (PH + QK)_{ij}.
\end{aligned}
\]

Theorem 1.3.6 is very useful due to the following reasons:

1. The orders of the matrices $P$, $Q$, $H$ and $K$ are smaller than that of $A$ or $B$.

2. It may be possible to block the matrix in such a way that a few blocks are either identity matrices or zero matrices. In this case, it may be easy to handle the matrix product using the block form.

3. Or when we want to prove results using induction, then we may assume the result for $r \times r$ submatrices and then look for $(r + 1) \times (r + 1)$ submatrices, etc.

For example, if $A = \left[\begin{array}{cc|c} 1 & 2 & 0 \\ 2 & 5 & 0 \end{array}\right]$ and $B = \left[\begin{array}{cc} a & b \\ c & d \\ \hline e & f \end{array}\right]$, then

\[
AB = \begin{bmatrix} 1 & 2 \\ 2 & 5 \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \end{bmatrix} [e \;\; f] = \begin{bmatrix} a + 2c & b + 2d \\ 2a + 5c & 2b + 5d \end{bmatrix}.
\]

If $A = \begin{bmatrix} 0 & -1 & 2 \\ 3 & 1 & 4 \\ -2 & 5 & -3 \end{bmatrix}$, then $A$ can be decomposed in several ways, for example as

\[
A = \left[\begin{array}{cc|c} 0 & -1 & 2 \\ 3 & 1 & 4 \\ -2 & 5 & -3 \end{array}\right]
\]

and so on.

Suppose $A = \begin{bmatrix} P & Q \\ R & S \end{bmatrix}$, where the two block rows have $n_1$ and $n_2$ rows and the two block columns have $m_1$ and $m_2$ columns, and $B = \begin{bmatrix} E & F \\ G & H \end{bmatrix}$, where the two block rows have $r_1$ and $r_2$ rows and the two block columns have $s_1$ and $s_2$ columns. Then the matrices $P$, $Q$, $R$, $S$ and $E$, $F$, $G$, $H$ are called the blocks of the matrices $A$ and $B$, respectively.

Even if $A + B$ is defined, the orders of $P$ and $E$ may not be the same and hence, we may not be able to add $A$ and $B$ in the block form. But, if $A + B$ and $P + E$ are defined then
\[
A + B = \begin{bmatrix} P + E & Q + F \\ R + G & S + H \end{bmatrix}.
\]

Similarly, if the product $AB$ is defined, the product $PE$ need not be defined. Therefore, we can talk of the matrix product $AB$ as a block product of matrices if both the products $AB$ and $PE$ are defined. And in this case, we have
\[
AB = \begin{bmatrix} PE + QG & PF + QH \\ RE + SG & RF + SH \end{bmatrix}.
\]

That is, once a partition of $A$ is fixed, the partition of $B$ has to be properly chosen for purposes of block addition or multiplication.
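
Theorem 1.3.6 can be checked on a small numerical instance. The sketch below splits $A$ after its first $r$ columns and $B$ after its first $r$ rows, then compares $PH + QK$ with the ordinary product. The numerical entries of `B` are illustrative stand-ins for the symbols $a, \ldots, f$ used in the text, and all helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(row[k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for row in a]

def matadd(a, b):
    """Entrywise sum of two matrices of the same order."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def split_columns(a, r):
    """A = [P Q]: P holds the first r columns, Q the remaining ones."""
    return [row[:r] for row in a], [row[r:] for row in a]

def split_rows(b, r):
    """B = [H; K]: H holds the first r rows, K the remaining ones."""
    return b[:r], b[r:]

A = [[1, 2, 0],
     [2, 5, 0]]
B = [[1, 2],
     [3, 4],
     [5, 6]]
r = 2
P, Q = split_columns(A, r)
H, K = split_rows(B, r)
block_product = matadd(matmul(P, H), matmul(Q, K))
print(block_product == matmul(A, B))
```

Here `Q` is a zero block, so the second term `QK` contributes nothing, which is exactly the situation described in reason 2 above.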


Exercise 1.3.7 1. Compute the matrix product AB using the block matrix multiplication for the matrices 


1 O};0O 1 1 2] 2 #1 
= O 1]1 1 
0 1;1 +0 
0 1/0 1 
2. Let $A = \begin{bmatrix} P & Q \\ R & S \end{bmatrix}$. If $P$, $Q$, $R$ and $S$ are symmetric, what can you say about $A$? Are $P$, $Q$, $R$ and $S$ symmetric, when $A$ is symmetric?

3. Let $A = [a_{ij}]$ and $B = [b_{ij}]$ be two matrices. Suppose $a_1, a_2, \ldots, a_n$ are the rows of $A$ and $b_1, b_2, \ldots, b_p$ are the columns of $B$. If the product $AB$ is defined, then show that

\[
AB = [A b_1, A b_2, \ldots, A b_p] = \begin{bmatrix} a_1 B \\ a_2 B \\ \vdots \\ a_n B \end{bmatrix}.
\]

[That is, left multiplication by $A$ is the same as multiplying each column of $B$ by $A$. Similarly, right multiplication by $B$ is the same as multiplying each row of $A$ by $B$.]


1.4 Matrices over Complex Numbers 


Here the entries of the matrix are complex numbers. All the definitions still hold. One just needs to 


look at the following additional definitions. 


Definition 1.4.1 (Conjugate Transpose of a Matrix) 1. Let $A$ be an $m \times n$ matrix over $\mathbb{C}$. If $A = [a_{ij}]$ then the Conjugate of $A$, denoted by $\overline{A}$, is the matrix $B = [b_{ij}]$ with $b_{ij} = \overline{a_{ij}}$.

For example, let $A = \begin{bmatrix} 1 & 4 + 3i & i \\ 0 & 1 & i - 2 \end{bmatrix}$. Then
\[
\overline{A} = \begin{bmatrix} 1 & 4 - 3i & -i \\ 0 & 1 & -i - 2 \end{bmatrix}.
\]

2. Let $A$ be an $m \times n$ matrix over $\mathbb{C}$. If $A = [a_{ij}]$ then the Conjugate Transpose of $A$, denoted by $A^*$, is the matrix $B = [b_{ij}]$ with $b_{ij} = \overline{a_{ji}}$.

For example, let $A = \begin{bmatrix} 1 & 4 + 3i & i \\ 0 & 1 & i - 2 \end{bmatrix}$. Then
\[
A^* = \begin{bmatrix} 1 & 0 \\ 4 - 3i & 1 \\ -i & -i - 2 \end{bmatrix}.
\]

3. A square matrix $A$ over $\mathbb{C}$ is called Hermitian if $A^* = A$.

4. A square matrix $A$ over $\mathbb{C}$ is called skew-Hermitian if $A^* = -A$.

5. A square matrix $A$ over $\mathbb{C}$ is called unitary if $A^* A = A A^* = I$.

6. A square matrix $A$ over $\mathbb{C}$ is called Normal if $A A^* = A^* A$.

Remark 1.4.2 If $A = [a_{ij}]$ with $a_{ij} \in \mathbb{R}$, then $A^* = A^t$.

Exercise 1.4.3 1. Give examples of Hermitian, skew-Hermitian and unitary matrices that have entries with non-zero imaginary parts.

2. Restate the results on transpose in terms of conjugate transpose.

3. Show that for any square matrix $A$, $S = \frac{A + A^*}{2}$ is Hermitian, $T = \frac{A - A^*}{2}$ is skew-Hermitian, and $A = S + T$.

4. Show that if $A$ is a complex triangular matrix and $A A^* = A^* A$ then $A$ is a diagonal matrix.
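
The conjugate transpose of Definition 1.4.1 can be computed with Python's built-in complex numbers, where `1j` denotes $i$. This is a minimal sketch: the matrix `A` has rows $(1, 4+3i, i)$ and $(0, 1, i-2)$ as in the example of that definition, while the Hermitian matrix `H` is an illustrative assumption, not an example from the notes.

```python
def conjugate_transpose(a):
    """A*: entry (i, j) is the complex conjugate of entry (j, i) of `a`."""
    return [[complex(x).conjugate() for x in col] for col in zip(*a)]

def is_hermitian(a):
    """True if A* = A."""
    return conjugate_transpose(a) == [[complex(x) for x in row] for row in a]

A = [[1, 4 + 3j, 1j],
     [0, 1, 1j - 2]]
A_star = conjugate_transpose(A)
print(A_star)

H = [[2, 1 - 1j],
     [1 + 1j, 3]]        # hypothetical Hermitian example
print(is_hermitian(H))
```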


Chapter 2 


Linear System of Equations 


2.1 Introduction 


Let us look at some examples of linear systems. 


1. Suppose $a, b \in \mathbb{R}$. Consider the system $ax = b$.

(a) If $a \neq 0$ then the system has a UNIQUE SOLUTION $x = \frac{b}{a}$.
(b) If $a = 0$ and
i. $b \neq 0$ then the system has NO SOLUTION.
ii. $b = 0$ then the system has INFINITE NUMBER OF SOLUTIONS, namely all $x \in \mathbb{R}$.

2. We now consider a system with 2 equations in 2 unknowns.
Consider the equation $ax + by = c$. If one of the coefficients, $a$ or $b$, is non-zero, then this linear equation represents a line in $\mathbb{R}^2$. Thus for the system
\[
a_1 x + b_1 y = c_1 \quad \text{and} \quad a_2 x + b_2 y = c_2,
\]
the set of solutions is given by the points of intersection of the two lines. There are three cases to be considered. Each case is illustrated by an example.

(a) UNIQUE SOLUTION
$x + 2y = 1$ and $x + 3y = 1$. The unique solution is $(x, y)^t = (1, 0)^t$.
Observe that in this case, $a_1 b_2 - a_2 b_1 \neq 0$.

(b) INFINITE NUMBER OF SOLUTIONS
$x + 2y = 1$ and $2x + 4y = 2$. The set of solutions is $(x, y)^t = (1 - 2y, y)^t = (1, 0)^t + y(-2, 1)^t$ with $y$ arbitrary. In other words, both the equations represent the same line.
Observe that in this case, $a_1 b_2 - a_2 b_1 = 0$, $a_1 c_2 - a_2 c_1 = 0$ and $b_1 c_2 - b_2 c_1 = 0$.

(c) NO SOLUTION
$x + 2y = 1$ and $2x + 4y = 3$. The equations represent a pair of parallel lines and hence there is no point of intersection.
Observe that in this case, $a_1 b_2 - a_2 b_1 = 0$ but $a_1 c_2 - a_2 c_1 \neq 0$.

3. As a last example, consider 3 equations in 3 unknowns.
A linear equation $ax + by + cz = d$ represents a plane in $\mathbb{R}^3$ provided $(a, b, c) \neq (0, 0, 0)$. As in the case of 2 equations in 2 unknowns, we have to look at the points of intersection of the given three planes. Here again, we have three cases. The three cases are illustrated by examples.

(a) UNIQUE SOLUTION
Consider the system $x + y + z = 3$, $x + 4y + 2z = 7$ and $4x + 10y - z = 13$. The unique solution to this system is $(x, y, z)^t = (1, 1, 1)^t$; i.e. THE THREE PLANES INTERSECT AT A POINT.

(b) INFINITE NUMBER OF SOLUTIONS
Consider the system $x + y + z = 3$, $x + 2y + 2z = 5$ and $3x + 4y + 4z = 11$. The set of solutions to this system is $(x, y, z)^t = (1, 2 - z, z)^t = (1, 2, 0)^t + z(0, -1, 1)^t$, with $z$ arbitrary: THE THREE PLANES INTERSECT ON A LINE.

(c) NO SOLUTION
The system $x + y + z = 3$, $x + 2y + 2z = 5$ and $3x + 4y + 4z = 13$ has no solution. In this case, we get three parallel lines as intersections of the above planes taken two at a time.


The readers are advised to supply the proof. 
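
The three cases for two equations in two unknowns can be told apart using only the quantities $a_1 b_2 - a_2 b_1$, $a_1 c_2 - a_2 c_1$ and $b_1 c_2 - b_2 c_1$ noted above. The sketch below is one way to encode that test. It assumes each equation has at least one non-zero coefficient, so that it really represents a line, and the function name `classify` is illustrative.

```python
def classify(a1, b1, c1, a2, b2, c2):
    """Classify the system a1*x + b1*y = c1, a2*x + b2*y = c2.

    Assumes (a1, b1) != (0, 0) and (a2, b2) != (0, 0),
    i.e. each equation represents a line in the plane.
    """
    if a1 * b2 - a2 * b1 != 0:
        return "unique solution"            # the lines meet in one point
    if a1 * c2 - a2 * c1 == 0 and b1 * c2 - b2 * c1 == 0:
        return "infinite number of solutions"   # the same line twice
    return "no solution"                    # distinct parallel lines

print(classify(1, 2, 1, 1, 3, 1))   # x + 2y = 1, x + 3y = 1
print(classify(1, 2, 1, 2, 4, 2))   # x + 2y = 1, 2x + 4y = 2
print(classify(1, 2, 1, 2, 4, 3))   # x + 2y = 1, 2x + 4y = 3
```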


2.2. Definition and a Solution Method 


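Definition 2.2.1 (Linear System) A linear system of $m$ equations in $n$ unknowns $x_1, x_2, \ldots, x_n$ is a set of equations of the form

\[
\begin{aligned}
a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n &= b_1 \\
a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n &= b_2 \\
&\;\;\vdots \\
a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n &= b_m
\end{aligned}
\qquad (2.2.1)
\]

where for $1 \le i \le m$ and $1 \le j \le n$; $a_{ij}, b_i \in \mathbb{R}$. Linear System (2.2.1) is called HOMOGENEOUS if $b_1 = 0 = b_2 = \cdots = b_m$ and NON-HOMOGENEOUS otherwise.

We rewrite the above equations in the form $Ax = b$, where

\[
A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}, \quad
x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \quad \text{and} \quad
b = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}.
\]

The matrix $A$ is called the COEFFICIENT matrix and the block matrix $[A \;\; b]$ is the AUGMENTED matrix of the linear system (2.2.1).

Remark 2.2.2 Observe that the $i$th row of the augmented matrix $[A \;\; b]$ represents the $i$th equation and the $j$th column of the coefficient matrix $A$ corresponds to coefficients of the $j$th variable $x_j$. That is, for $1 \le i \le m$ and $1 \le j \le n$, the entry $a_{ij}$ of the coefficient matrix $A$ corresponds to the $i$th equation and $j$th variable $x_j$.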


For a system of linear equations Ax = b, the system Ax = 0 is called the ASSOCIATED HOMOGENEOUS 
SYSTEM. 


Definition 2.2.3 (Solution of a Linear System) A solution of the linear system $Ax = b$ is a column vector $y$ with entries $y_1, y_2, \ldots, y_n$ such that the linear system (2.2.1) is satisfied by substituting $y_i$ in place of $x_i$.

That is, if $y^t = [y_1, y_2, \ldots, y_n]$ then $Ay = b$ holds.
Note: The zero $n$-tuple $x = 0$ is always a solution of the system $Ax = 0$, and is called the TRIVIAL solution. A non-zero $n$-tuple $x$, if it satisfies $Ax = 0$, is called a NON-TRIVIAL solution.

2.2.1 A Solution Method


Example 2.2.4 Let us solve the linear system $x + 7y + 3z = 11$, $x + y + z = 3$, and $4x + 10y - z = 13$.

Solution:
1. The above linear system and the linear system

\[
\begin{aligned}
x + y + z &= 3 \\
x + 7y + 3z &= 11 \\
4x + 10y - z &= 13
\end{aligned}
\qquad (2.2.2)
\]

(obtained by interchanging the first two equations) have the same set of solutions. (why?)

2. Eliminating $x$ from the 2nd and 3rd equations, we get the linear system

\[
\begin{aligned}
x + y + z &= 3 \\
6y + 2z &= 8 \\
6y - 5z &= 1
\end{aligned}
\qquad (2.2.3)
\]

Here the second equation is obtained by subtracting the first equation from the second equation, and the third equation is obtained by subtracting 4 times the first equation from the third equation. This system and the system (2.2.2) have the same set of solutions. (why?)

3. Eliminating $y$ from the last two equations of system (2.2.3), we get the system

\[
\begin{aligned}
x + y + z &= 3 \\
6y + 2z &= 8 \\
7z &= 7
\end{aligned}
\qquad (2.2.4)
\]

where the last equation is obtained by subtracting the third equation from the second equation. This system has the same set of solutions as the system (2.2.3). (why?)

4. The system (2.2.4) and the system

\[
\begin{aligned}
x + y + z &= 3 \\
3y + z &= 4 \\
z &= 1
\end{aligned}
\qquad (2.2.5)
\]

(obtained by dividing the second equation by 2 and the third equation by 7) have the same set of solutions. (why?)

5. Now, $z = 1$ implies $y = \frac{4 - 1}{3} = 1$ and $x = 3 - (1 + 1) = 1$. Or in terms of a vector, the set of solutions is $\{ (x, y, z)^t : (x, y, z) = (1, 1, 1) \}$.
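
The steps of Example 2.2.4 can be replayed on the augmented matrix of the system $x + 7y + 3z = 11$, $x + y + z = 3$, $4x + 10y - z = 13$. The sketch below uses exact fractions and three small in-place helpers whose names are illustrative. "Subtract the third equation from the second" is realised here as a sign change of the third row followed by adding the second row to it.

```python
from fractions import Fraction

# Augmented matrix [A b] of the original system, one row per equation.
M = [[Fraction(v) for v in row] for row in
     [[1, 7, 3, 11],
      [1, 1, 1, 3],
      [4, 10, -1, 13]]]

def swap(m, i, j):
    m[i], m[j] = m[j], m[i]

def scale(m, k, c):
    m[k] = [c * v for v in m[k]]

def add_multiple(m, k, j, c):
    """Replace row k by row k plus c times row j."""
    m[k] = [v + c * w for v, w in zip(m[k], m[j])]

swap(M, 0, 1)                  # system (2.2.2)
add_multiple(M, 1, 0, -1)      # 6y + 2z = 8
add_multiple(M, 2, 0, -4)      # 6y - 5z = 1, system (2.2.3)
scale(M, 2, -1)                # -6y + 5z = -1
add_multiple(M, 2, 1, 1)       # 7z = 7, system (2.2.4)
scale(M, 1, Fraction(1, 2))    # 3y + z = 4
scale(M, 2, Fraction(1, 7))    # z = 1, system (2.2.5)

# Back substitution, from the last equation upwards.
z = M[2][3] / M[2][2]
y = (M[1][3] - M[1][2] * z) / M[1][1]
x = (M[0][3] - M[0][1] * y - M[0][2] * z) / M[0][0]
print(x, y, z)  # 1 1 1
```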


2.3. Row Operations and Equivalent Systems 


Definition 2.3.1 (Elementary Operations) The following operations 1, 2 and 3 are called elementary operations.

1. interchange of two equations, say “interchange the $i$th and $j$th equations”;
(compare the system (2.2.2) with the original system.)

2. multiply a non-zero constant throughout an equation, say “multiply the $k$th equation by $c \neq 0$”;
(compare the system (2.2.5) and the system (2.2.4).)

3. replace an equation by itself plus a constant multiple of another equation, say “replace the $k$th equation by $k$th equation plus $c$ times the $j$th equation”.
(compare the system (2.2.3) with (2.2.2) or the system (2.2.4) with (2.2.3).)

Observations:

1. In the above example, observe that the elementary operations helped us in getting a linear system (2.2.5), which was easily solvable.

2. Note that at Step 1, if we interchange the first and the second equation, we get back to the linear system from which we had started. This means the operation at Step 1 has an inverse operation. In other words, the INVERSE OPERATION sends us back to the step where we had precisely started.

So, in Example 2.2.4, the application of a finite number of elementary operations helped us to obtain a simpler system whose solution can be obtained directly. That is, after applying a finite number of elementary operations, a simpler linear system is obtained which can be easily solved. Note that the three elementary operations defined above have corresponding INVERSE operations, namely,

1. “interchange the $i$th and $j$th equations”,

2. “divide the $k$th equation by $c \neq 0$”;

3. “replace the $k$th equation by $k$th equation minus $c$ times the $j$th equation”.

It will be a useful exercise for the reader to IDENTIFY THE INVERSE OPERATIONS at each step in Example 2.2.4.


Definition 2.3.2 (Equivalent Linear Systems) Two linear systems are said to be equivalent if one can be 
obtained from the other by a finite number of elementary operations. 


The linear systems at each step in Example 2.2.4 are equivalent to each other and also to the original linear system.


Lemma 2.3.3 Let Cx = d be the linear system obtained from the linear system Ax = b by a single 
elementary operation. Then the linear systems Ax = b and Cx = d have the same set of solutions. 


Proof. We prove the result for the elementary operation “the $k$th equation is replaced by $k$th equation plus $c$ times the $j$th equation.” The reader is advised to prove the result for other elementary operations.

In this case, the systems $Ax = b$ and $Cx = d$ vary only in the $k$th equation. Let $(\alpha_1, \alpha_2, \ldots, \alpha_n)$ be a solution of the linear system $Ax = b$. Then substituting for $\alpha_i$'s in place of $x_i$'s in the $k$th and $j$th equations, we get

\[
a_{k1} \alpha_1 + a_{k2} \alpha_2 + \cdots + a_{kn} \alpha_n = b_k, \quad \text{and} \quad a_{j1} \alpha_1 + a_{j2} \alpha_2 + \cdots + a_{jn} \alpha_n = b_j.
\]

Therefore,
\[
(a_{k1} + c a_{j1}) \alpha_1 + (a_{k2} + c a_{j2}) \alpha_2 + \cdots + (a_{kn} + c a_{jn}) \alpha_n = b_k + c b_j. \qquad (2.3.1)
\]

But then the $k$th equation of the linear system $Cx = d$ is

\[
(a_{k1} + c a_{j1}) x_1 + (a_{k2} + c a_{j2}) x_2 + \cdots + (a_{kn} + c a_{jn}) x_n = b_k + c b_j. \qquad (2.3.2)
\]

Therefore, using Equation (2.3.1), $(\alpha_1, \alpha_2, \ldots, \alpha_n)$ is also a solution for the $k$th Equation (2.3.2).

Use a similar argument to show that if $(\beta_1, \beta_2, \ldots, \beta_n)$ is a solution of the linear system $Cx = d$ then it is also a solution of the linear system $Ax = b$.


Hence, we have the proof in this case. 


Lemma 2.3.3 is now used as an induction step to prove the main result of this section (Theorem 2.3.4).

Theorem 2.3.4 Two equivalent systems have the same set of solutions.

Proof. Let $n$ be the number of elementary operations performed on $Ax = b$ to get $Cx = d$. We prove the theorem by induction on $n$.

If $n = 1$, Lemma 2.3.3 answers the question. If $n > 1$, assume that the theorem is true for $n = m$. Now, suppose $n = m + 1$. Apply Lemma 2.3.3 again at the “last step” (that is, at the $(m + 1)$th step from the $m$th step) to get the required result using induction.

Let us formalise the above section which led to Theorem 2.3.4. For solving a linear system of equations, we applied elementary operations to equations. It is observed that in performing the elementary operations, the calculations were made on the COEFFICIENTS (numbers). The variables $x_1, x_2, \ldots, x_n$ and the sign of equality (that is, “=”) are not disturbed. Therefore, in place of looking at the system of equations as a whole, we just need to work with the coefficients. These coefficients when arranged in a rectangular array give us the augmented matrix $[A \;\; b]$.


Definition 2.3.5 (Elementary Row Operations) The elementary row operations are defined as:
1. interchange of two rows, say “interchange the ith and jth rows”, denoted $R_{ij}$;
2. multiply a non-zero constant throughout a row, say “multiply the kth row by $c \ne 0$”, denoted $R_k(c)$;
3. replace a row by itself plus a constant multiple of another row, say “replace the kth row by kth row plus c times the jth row”, denoted $R_{kj}(c)$.
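The three operations can be tried out on a small matrix. The following is only a sketch: the function names `swap_rows`, `scale_row` and `add_multiple` are our own (hypothetical) choices, rows are numbered from 1 as in the text, and exact fractions are used so that no rounding occurs.

```python
from fractions import Fraction


def swap_rows(M, i, j):
    """R_ij: interchange the ith and jth rows (rows numbered from 1)."""
    M = [row[:] for row in M]          # work on a copy
    M[i - 1], M[j - 1] = M[j - 1], M[i - 1]
    return M


def scale_row(M, k, c):
    """R_k(c): multiply the kth row by a non-zero constant c."""
    if c == 0:
        raise ValueError("c must be non-zero")
    M = [row[:] for row in M]
    M[k - 1] = [c * entry for entry in M[k - 1]]
    return M


def add_multiple(M, k, j, c):
    """R_kj(c): replace the kth row by kth row plus c times the jth row."""
    M = [row[:] for row in M]
    M[k - 1] = [a + c * b for a, b in zip(M[k - 1], M[j - 1])]
    return M


# A small 3 x 4 matrix, chosen only for illustration.
A = [[Fraction(v) for v in row]
     for row in [[0, 1, 1, 2], [2, 0, 3, 5], [1, 1, 1, 3]]]

B = scale_row(swap_rows(A, 1, 2), 1, Fraction(1, 2))   # R_12, then R_1(1/2)
C = add_multiple(B, 3, 1, -1)                          # then R_31(-1)
```

Each function returns a new matrix and leaves its argument unchanged, which makes it easy to compare a matrix before and after an operation.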


Exercise 2.3.6 Find the INVERSE row operations corresponding to the elementary row operations that have 
been defined just above. 


Definition 2.3.7 (Row Equivalent Matrices) Two matrices are said to be row-equivalent if one can be 
obtained from the other by a finite number of elementary row operations. 


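Example 2.3.8 The three matrices given below are row equivalent.
\[
\begin{bmatrix} 0 & 1 & 1 & 2 \\ 2 & 0 & 3 & 5 \\ 1 & 1 & 1 & 3 \end{bmatrix}
\xrightarrow{R_{12}}
\begin{bmatrix} 2 & 0 & 3 & 5 \\ 0 & 1 & 1 & 2 \\ 1 & 1 & 1 & 3 \end{bmatrix}
\xrightarrow{R_1(1/2)}
\begin{bmatrix} 1 & 0 & \frac{3}{2} & \frac{5}{2} \\ 0 & 1 & 1 & 2 \\ 1 & 1 & 1 & 3 \end{bmatrix}.
\]
Whereas the matrix $\begin{bmatrix} 0 & 1 & 1 & 2 \\ 2 & 0 & 3 & 5 \\ 1 & 1 & 1 & 3 \end{bmatrix}$ is not row equivalent to the matrix $\begin{bmatrix} 1 & 0 & 1 & 2 \\ 0 & 2 & 3 & 5 \\ 1 & 1 & 1 & 3 \end{bmatrix}$.

2.3.1 Gauss Elimination Method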


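Definition 2.3.9 (Forward/Gauss Elimination Method) Gaussian elimination is a method of solving a linear system Ax = b (consisting of m equations in n unknowns) by bringing the augmented matrix
\[
[A \;\; b] = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\
a_{21} & a_{22} & \cdots & a_{2n} & b_2 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn} & b_m
\end{bmatrix}
\]
to an upper triangular form
\[
\left[\begin{array}{cccc|c}
c_{11} & c_{12} & \cdots & c_{1n} & d_1 \\
0 & c_{22} & \cdots & c_{2n} & d_2 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & c_{mn} & d_m
\end{array}\right].
\]
This elimination process is also called the forward elimination method.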


The following examples illustrate the Gauss elimination procedure. 


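Example 2.3.10 Solve the linear system by Gauss elimination method.
\[
\begin{array}{rcl}
y + z &=& 2 \\
2x + 3z &=& 5 \\
x + y + z &=& 3
\end{array}
\]
Solution: In this case, the augmented matrix is $\begin{bmatrix} 0 & 1 & 1 & 2 \\ 2 & 0 & 3 & 5 \\ 1 & 1 & 1 & 3 \end{bmatrix}$. The method proceeds along the following steps.

1. Interchange the 1st and 2nd equations (or $R_{12}$).
\[
\begin{array}{rcl} 2x + 3z &=& 5 \\ y + z &=& 2 \\ x + y + z &=& 3 \end{array}
\qquad
\begin{bmatrix} 2 & 0 & 3 & 5 \\ 0 & 1 & 1 & 2 \\ 1 & 1 & 1 & 3 \end{bmatrix}
\]

2. Divide the 1st equation by 2 (or $R_1(1/2)$).
\[
\begin{array}{rcl} x + \frac{3}{2}z &=& \frac{5}{2} \\ y + z &=& 2 \\ x + y + z &=& 3 \end{array}
\qquad
\begin{bmatrix} 1 & 0 & \frac{3}{2} & \frac{5}{2} \\ 0 & 1 & 1 & 2 \\ 1 & 1 & 1 & 3 \end{bmatrix}
\]

3. Add −1 times the 1st equation to the 3rd equation (or $R_{31}(-1)$).
\[
\begin{array}{rcl} x + \frac{3}{2}z &=& \frac{5}{2} \\ y + z &=& 2 \\ y - \frac{1}{2}z &=& \frac{1}{2} \end{array}
\qquad
\begin{bmatrix} 1 & 0 & \frac{3}{2} & \frac{5}{2} \\ 0 & 1 & 1 & 2 \\ 0 & 1 & -\frac{1}{2} & \frac{1}{2} \end{bmatrix}
\]

4. Add −1 times the 2nd equation to the 3rd equation (or $R_{32}(-1)$).
\[
\begin{array}{rcl} x + \frac{3}{2}z &=& \frac{5}{2} \\ y + z &=& 2 \\ -\frac{3}{2}z &=& -\frac{3}{2} \end{array}
\qquad
\begin{bmatrix} 1 & 0 & \frac{3}{2} & \frac{5}{2} \\ 0 & 1 & 1 & 2 \\ 0 & 0 & -\frac{3}{2} & -\frac{3}{2} \end{bmatrix}
\]

5. Multiply the 3rd equation by $-\frac{2}{3}$ (or $R_3(-\frac{2}{3})$).
\[
\begin{array}{rcl} x + \frac{3}{2}z &=& \frac{5}{2} \\ y + z &=& 2 \\ z &=& 1 \end{array}
\qquad
\begin{bmatrix} 1 & 0 & \frac{3}{2} & \frac{5}{2} \\ 0 & 1 & 1 & 2 \\ 0 & 0 & 1 & 1 \end{bmatrix}
\]

The last equation gives z = 1, the second equation now gives y = 1. Finally the first equation gives x = 1. Hence the set of solutions is $(x, y, z)^t = (1, 1, 1)^t$, A UNIQUE SOLUTION.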
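The steps of this example can be mimicked in a few lines of Python. This is a minimal sketch under stated assumptions: the function name `gauss_solve` is hypothetical, the system is assumed square with a unique solution, and exact fractions replace hand arithmetic.

```python
from fractions import Fraction


def gauss_solve(augmented):
    """Forward elimination followed by back substitution.

    `augmented` is the n x (n + 1) augmented matrix [A b] of a square
    system that is assumed to have a unique solution.
    """
    n = len(augmented)
    M = [[Fraction(v) for v in row] for row in augmented]

    for col in range(n):
        # Find a row (at or below the current one) with a non-zero entry.
        src = next((r for r in range(col, n) if M[r][col] != 0), None)
        if src is None:
            raise ValueError("the system does not have a unique solution")
        M[col], M[src] = M[src], M[col]              # row interchange
        lead = M[col][col]
        M[col] = [v / lead for v in M[col]]          # make the leading entry 1
        for r in range(col + 1, n):                  # clear entries below it
            factor = M[r][col]
            M[r] = [a - factor * b for a, b in zip(M[r], M[col])]

    # Back substitution, starting from the last equation.
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        x[i] = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
    return x


solution = gauss_solve([[0, 1, 1, 2], [2, 0, 3, 5], [1, 1, 1, 3]])
```

For the augmented matrix of Example 2.3.10 the row operations chosen by this sketch coincide with Steps 1 to 5 above, although in general the choice of rows to interchange is not unique.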


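Example 2.3.11 Solve the linear system by Gauss elimination method.
\[
\begin{array}{rcl}
x + y + z &=& 3 \\
x + 2y + 2z &=& 5 \\
3x + 4y + 4z &=& 11
\end{array}
\]
Solution: In this case, the augmented matrix is $\begin{bmatrix} 1 & 1 & 1 & 3 \\ 1 & 2 & 2 & 5 \\ 3 & 4 & 4 & 11 \end{bmatrix}$ and the method proceeds as follows:

1. Add −1 times the first equation to the second equation.
\[
\begin{array}{rcl} x + y + z &=& 3 \\ y + z &=& 2 \\ 3x + 4y + 4z &=& 11 \end{array}
\qquad
\begin{bmatrix} 1 & 1 & 1 & 3 \\ 0 & 1 & 1 & 2 \\ 3 & 4 & 4 & 11 \end{bmatrix}
\]

2. Add −3 times the first equation to the third equation.
\[
\begin{array}{rcl} x + y + z &=& 3 \\ y + z &=& 2 \\ y + z &=& 2 \end{array}
\qquad
\begin{bmatrix} 1 & 1 & 1 & 3 \\ 0 & 1 & 1 & 2 \\ 0 & 1 & 1 & 2 \end{bmatrix}
\]

3. Add −1 times the second equation to the third equation.
\[
\begin{array}{rcl} x + y + z &=& 3 \\ y + z &=& 2 \end{array}
\qquad
\begin{bmatrix} 1 & 1 & 1 & 3 \\ 0 & 1 & 1 & 2 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\]

Thus, the set of solutions is $(x, y, z)^t = (1, 2 - z, z)^t = (1, 2, 0)^t + z(0, -1, 1)^t$, with z arbitrary. In other words, the system has INFINITE NUMBER OF SOLUTIONS.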


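Example 2.3.12 Solve the linear system by Gauss elimination method.
\[
\begin{array}{rcl}
x + y + z &=& 3 \\
x + 2y + 2z &=& 5 \\
3x + 4y + 4z &=& 12
\end{array}
\]
Solution: In this case, the augmented matrix is $\begin{bmatrix} 1 & 1 & 1 & 3 \\ 1 & 2 & 2 & 5 \\ 3 & 4 & 4 & 12 \end{bmatrix}$ and the method proceeds as follows:

1. Add −1 times the first equation to the second equation.
\[
\begin{array}{rcl} x + y + z &=& 3 \\ y + z &=& 2 \\ 3x + 4y + 4z &=& 12 \end{array}
\qquad
\begin{bmatrix} 1 & 1 & 1 & 3 \\ 0 & 1 & 1 & 2 \\ 3 & 4 & 4 & 12 \end{bmatrix}
\]

2. Add −3 times the first equation to the third equation.
\[
\begin{array}{rcl} x + y + z &=& 3 \\ y + z &=& 2 \\ y + z &=& 3 \end{array}
\qquad
\begin{bmatrix} 1 & 1 & 1 & 3 \\ 0 & 1 & 1 & 2 \\ 0 & 1 & 1 & 3 \end{bmatrix}
\]

3. Add −1 times the second equation to the third equation.
\[
\begin{array}{rcl} x + y + z &=& 3 \\ y + z &=& 2 \\ 0 &=& 1 \end{array}
\qquad
\begin{bmatrix} 1 & 1 & 1 & 3 \\ 0 & 1 & 1 & 2 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\]

The third equation in the last step is
\[ 0x + 0y + 0z = 1. \]

This can never hold for any value of x, y, z. Hence, the system has NO SOLUTION.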


Remark 2.3.13 Note that to solve a linear system, Ax = b, one needs to apply only the elementary row operations to the augmented matrix [A b].


2.4 Row Reduced Echelon Form of a Matrix 

Definition 2.4.1 (Row Reduced Form of a Matrix) A matrix C is said to be in the row reduced form if 
1. THE FIRST NON-ZERO ENTRY IN EACH ROW OF C IS 1;
2. THE COLUMN CONTAINING THIS 1 HAS ALL ITS OTHER ENTRIES ZERO. 


A matrix in the row reduced form is also called a ROW REDUCED MATRIX. 


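Example 2.4.2 1. One of the most important examples of a row reduced matrix is the n × n identity matrix, $I_n$. Recall that the (i, j)th entry of the identity matrix is
\[ I_{ij} = \delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \ne j. \end{cases} \]

2. The matrices $\begin{bmatrix} 0 & 1 & 0 & -1 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}$ and $\begin{bmatrix} 0 & 1 & 0 & 4 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$ are also in row reduced form.

3. The matrix $\begin{bmatrix} 1 & 0 & 0 & 0 & 5 \\ 0 & 1 & 1 & 1 & 2 \\ 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$ is not in the row reduced form. (why?)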
000 0 0 


Definition 2.4.3 (Leading Term, Leading Column) For a row-reduced matrix, the first non-zero entry of 
any row is called a LEADING TERM. The columns containing the leading terms are called the LEADING 
COLUMNS. 




Definition 2.4.4 (Basic, Free Variables) Consider the linear system Ax = b in n variables and m equations. Let [C d] be the row-reduced matrix obtained by applying the Gauss elimination method to the augmented matrix [A b]. Then the variables corresponding to the leading columns in the first n columns of [C d] are called the BASIC variables. The variables which are not basic are called FREE variables.


The free variables are called so as they can be assigned arbitrary values and the value of the basic 
variables can then be written in terms of the free variables. 
Observation: In Example 2.3.11, the solution set was given by
\[ (x, y, z)^t = (1, 2 - z, z)^t = (1, 2, 0)^t + z(0, -1, 1)^t, \quad \text{with } z \text{ arbitrary.} \]
That is, we had two basic variables, x and y, and z as a free variable.


Remark 2.4.5 It is very important to observe that if there are r non-zero rows in the row-reduced form of the matrix then there will be r leading terms. That is, there will be r leading columns. Therefore, IF THERE ARE r LEADING TERMS AND n VARIABLES, THEN THERE WILL BE r BASIC VARIABLES AND n − r FREE VARIABLES.


2.4.1 Gauss-Jordan Elimination 


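We now start with Step 5 of Example 2.3.10 and apply the elementary operations once again. But this time, we start with the 3rd row.

I. Add −1 times the third equation to the second equation (or $R_{23}(-1)$).
\[
\begin{array}{rcl} x + \frac{3}{2}z &=& \frac{5}{2} \\ y &=& 1 \\ z &=& 1 \end{array}
\qquad
\begin{bmatrix} 1 & 0 & \frac{3}{2} & \frac{5}{2} \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 \end{bmatrix}
\]

II. Add $-\frac{3}{2}$ times the third equation to the first equation (or $R_{13}(-\frac{3}{2})$).
\[
\begin{array}{rcl} x &=& 1 \\ y &=& 1 \\ z &=& 1 \end{array}
\qquad
\begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 \end{bmatrix}
\]

III. From the above matrix, we directly have the set of solutions as $(x, y, z)^t = (1, 1, 1)^t$.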


Definition 2.4.6 (Row Reduced Echelon Form of a Matrix) A matrix C is said to be in the row reduced echelon form if

1. C is already in the row reduced form;
2. the rows consisting of all zeros come below all non-zero rows; and
3. the leading terms appear from left to right in successive rows. That is, for $1 \le \ell \le k$, let $i_\ell$ be the leading column of the $\ell$th row. Then $i_1 < i_2 < \cdots < i_k$.


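Example 2.4.7 Suppose $A = \begin{bmatrix} 0 & 1 & 0 & 2 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 0 & 0 & 0 & 1 & 0 \\ 1 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}$ are in row reduced form. Then the corresponding matrices in the row reduced echelon form are respectively, $\begin{bmatrix} 0 & 1 & 0 & 2 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}$ and $\begin{bmatrix} 1 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}$.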



Definition 2.4.8 (Row Reduced Echelon Matrix) A matrix which is in the row reduced echelon form is also called a row reduced echelon matrix.


Definition 2.4.9 (Back Substitution/Gauss-Jordan Method) The procedure to get to Step II of Example 2.3.10 from Step 5 of Example 2.3.10 is called the back substitution.
The elimination process applied to obtain the row reduced echelon form of the augmented matrix is called the Gauss-Jordan elimination.


That is, the Gauss-Jordan elimination method consists of both the forward elimination and the backward substitution.

Method to get the row-reduced echelon form of a given matrix A
Let A be an m × n matrix. Then the following method is used to obtain the row-reduced echelon form of the matrix A.

Step 1: Consider the first column of the matrix A.
If all the entries in the first column are zero, move to the second column.
Else, find a row, say the ith row, which contains a non-zero entry in the first column. Now, interchange the first row with the ith row. Suppose the non-zero entry in the (1, 1)-position is $\alpha \ne 0$. Divide the whole row by $\alpha$ so that the (1, 1)-entry of the new matrix is 1. Now, use this 1 to make all the entries below it equal to 0.

Step 2: If all entries in the first column after the first step are zero, consider the right m × (n − 1) submatrix of the matrix obtained in step 1 and proceed as in step 1.
Else, forget the first row and first column. Start with the lower (m − 1) × (n − 1) submatrix of the matrix obtained in the first step and proceed as in step 1.

Step 3: Keep repeating this process till we reach a stage where all the entries below a particular row, say the rth row, are zero. Suppose at this stage we have obtained a matrix C. Then C has the following form:
1. THE FIRST NON-ZERO ENTRY IN EACH ROW of C is 1. These 1's are the leading terms of C and the columns containing these leading terms are the leading columns.
2. THE ENTRIES OF C BELOW THE LEADING TERM ARE ALL ZERO.

Step 4: Now use the leading term in the rth row to make all other entries in the rth leading column equal to zero.

Step 5: Next, use the leading term in the (r − 1)th row to make all other entries in the (r − 1)th leading column equal to zero and continue till we come to the first leading term or column.

The final matrix is the row-reduced echelon form of the matrix A.
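The method above can be sketched in Python as follows. Two points are assumptions of this sketch and not part of the text: the function name `rref` is hypothetical, and the entries above and below a leading term are cleared in the same pass (instead of Steps 4 and 5 being carried out afterwards). Since the row reduced echelon form is determined by the matrix, the final result is the same.

```python
from fractions import Fraction


def rref(matrix):
    """Return the row reduced echelon form of `matrix` (a list of rows)."""
    M = [[Fraction(v) for v in row] for row in matrix]
    rows, cols = len(M), len(M[0])
    pivot_row = 0                      # row that receives the next leading term

    for col in range(cols):
        if pivot_row == rows:
            break
        # Look for a non-zero entry in this column, at or below pivot_row.
        src = next((r for r in range(pivot_row, rows) if M[r][col] != 0), None)
        if src is None:
            continue                   # column has no leading term; move right
        M[pivot_row], M[src] = M[src], M[pivot_row]
        lead = M[pivot_row][col]
        M[pivot_row] = [v / lead for v in M[pivot_row]]   # leading term becomes 1
        for r in range(rows):          # clear the rest of the leading column
            if r != pivot_row and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[pivot_row])]
        pivot_row += 1
    return M


reduced = rref([[0, 1, 1, 2], [2, 0, 3, 5], [1, 1, 1, 3]])
```

Applied to the augmented matrix of Example 2.3.10, the last column of `reduced` is the solution $(1, 1, 1)^t$ found earlier.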

Remark 2.4.10 Note that the row reduction involves only row operations and proceeds from LEFT TO RIGHT. Hence, if A is a matrix consisting of first s columns of a matrix C, then the row reduced form of A will be the first s columns of the row reduced form of C.


The proof of the following theorem is beyond the scope of this book and is omitted. 
Theorem 2.4.11 The row reduced echelon form of a matrix is unique. 


Exercise 2.4.12 1. Solve the following linear system.
(a) x + y + z + w = 0, x − y + z + w = 0 and −x + y + 3z + 3w = 0.
(b) x + 2y + 3z = 1 and x + 3y + 2z = 1.
(c) x + y + z = 3, x + y − z = 1 and x + y + 7z = 6.
(d) x + y + z = 3, x + y − z = 1 and x + y + 4z = 6.
(e) x + y + z = 3, x + y − z = 1, x + y + 4z = 6 and x + y − 4z = −1.


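2. Find the row-reduced echelon form of the following matrices.
\[
\begin{bmatrix} -1 & 1 & 3 & 5 \\ 1 & 3 & 5 & 7 \\ 9 & 11 & 13 & 15 \\ -3 & -1 & 1 & 3 \end{bmatrix},
\qquad
\begin{bmatrix} 10 & 8 & 6 & 4 \\ 2 & 0 & -2 & -4 \\ -6 & -8 & -10 & -12 \\ -2 & -4 & -6 & -8 \end{bmatrix}
\]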

2.4.2 Elementary Matrices 


Definition 2.4.13 A square matrix E of order n is called an elementary matrix if it is obtained by applying exactly one elementary row operation to the identity matrix, $I_n$.


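Remark 2.4.14 There are three types of elementary matrices.

1. $E_{ij}$, which is obtained by the application of the elementary row operation $R_{ij}$ to the identity matrix, $I_n$. Thus, the $(k, \ell)$th entry of $E_{ij}$ is
\[ (E_{ij})_{(k,\ell)} = \begin{cases} 1 & \text{if } k = \ell \text{ and } \ell \ne i, j \\ 1 & \text{if } (k, \ell) = (i, j) \text{ or } (k, \ell) = (j, i) \\ 0 & \text{otherwise.} \end{cases} \]

2. $E_k(c)$, which is obtained by the application of the elementary row operation $R_k(c)$ to the identity matrix, $I_n$. The (i, j)th entry of $E_k(c)$ is
\[ (E_k(c))_{(i,j)} = \begin{cases} 1 & \text{if } i = j \text{ and } i \ne k \\ c & \text{if } i = j = k \\ 0 & \text{otherwise.} \end{cases} \]

3. $E_{ij}(c)$, which is obtained by the application of the elementary row operation $R_{ij}(c)$ to the identity matrix, $I_n$. The $(k, \ell)$th entry of $E_{ij}(c)$ is
\[ (E_{ij}(c))_{(k,\ell)} = \begin{cases} 1 & \text{if } k = \ell \\ c & \text{if } (k, \ell) = (i, j) \\ 0 & \text{otherwise.} \end{cases} \]

In particular,
\[
E_{23} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}, \quad
E_1(c) = \begin{bmatrix} c & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad \text{and} \quad
E_{23}(c) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & c \\ 0 & 0 & 1 \end{bmatrix}.
\]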
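Example 2.4.15 1. Let $A = \begin{bmatrix} 1 & 2 & 3 & 0 \\ 2 & 0 & 3 & 4 \\ 3 & 4 & 5 & 6 \end{bmatrix}$. Then
\[
\begin{bmatrix} 1 & 2 & 3 & 0 \\ 2 & 0 & 3 & 4 \\ 3 & 4 & 5 & 6 \end{bmatrix}
\xrightarrow{R_{23}}
\begin{bmatrix} 1 & 2 & 3 & 0 \\ 3 & 4 & 5 & 6 \\ 2 & 0 & 3 & 4 \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix} A = E_{23}A.
\]
That is, interchanging the two rows of the matrix A is same as multiplying on the left by the corresponding elementary matrix. In other words, we see that the left multiplication of elementary matrices to a matrix results in elementary row operations.

2. Consider the augmented matrix $[A \;\; b] = \begin{bmatrix} 0 & 1 & 1 & 2 \\ 2 & 0 & 3 & 5 \\ 1 & 1 & 1 & 3 \end{bmatrix}$. Then the result of the steps given below is same as the matrix product
\[ E_{23}(-1)\, E_{12}(-1)\, E_3(1/3)\, E_{32}(2)\, E_{23}\, E_{21}(-2)\, E_{13}\, [A \;\; b]. \]
\[
\begin{bmatrix} 0 & 1 & 1 & 2 \\ 2 & 0 & 3 & 5 \\ 1 & 1 & 1 & 3 \end{bmatrix}
\xrightarrow{R_{13}}
\begin{bmatrix} 1 & 1 & 1 & 3 \\ 2 & 0 & 3 & 5 \\ 0 & 1 & 1 & 2 \end{bmatrix}
\xrightarrow{R_{21}(-2)}
\begin{bmatrix} 1 & 1 & 1 & 3 \\ 0 & -2 & 1 & -1 \\ 0 & 1 & 1 & 2 \end{bmatrix}
\xrightarrow{R_{23}}
\begin{bmatrix} 1 & 1 & 1 & 3 \\ 0 & 1 & 1 & 2 \\ 0 & -2 & 1 & -1 \end{bmatrix}
\]
\[
\xrightarrow{R_{32}(2)}
\begin{bmatrix} 1 & 1 & 1 & 3 \\ 0 & 1 & 1 & 2 \\ 0 & 0 & 3 & 3 \end{bmatrix}
\xrightarrow{R_3(1/3)}
\begin{bmatrix} 1 & 1 & 1 & 3 \\ 0 & 1 & 1 & 2 \\ 0 & 0 & 1 & 1 \end{bmatrix}
\xrightarrow{R_{12}(-1)}
\begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 2 \\ 0 & 0 & 1 & 1 \end{bmatrix}
\xrightarrow{R_{23}(-1)}
\begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 \end{bmatrix}
\]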
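The claim in the second part of this example can be checked by multiplying out the seven elementary matrices. The following is a sketch only: the helper names `e_swap`, `e_scale`, `e_add` and `matmul` are hypothetical, with `e_add(n, i, j, c)` placing c in the (i, j) position as in Remark 2.4.14.

```python
from fractions import Fraction


def identity(n):
    return [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]


def e_swap(n, i, j):
    """E_ij: the identity with rows i and j interchanged (1-based)."""
    E = identity(n)
    E[i - 1], E[j - 1] = E[j - 1], E[i - 1]
    return E


def e_scale(n, k, c):
    """E_k(c): the identity with the (k, k) entry replaced by c."""
    E = identity(n)
    E[k - 1][k - 1] = Fraction(c)
    return E


def e_add(n, i, j, c):
    """E_ij(c): the identity with c placed in the (i, j) position."""
    E = identity(n)
    E[i - 1][j - 1] = Fraction(c)
    return E


def matmul(X, Y):
    return [[sum(X[r][k] * Y[k][c] for k in range(len(Y)))
             for c in range(len(Y[0]))] for r in range(len(X))]


Ab = [[0, 1, 1, 2], [2, 0, 3, 5], [1, 1, 1, 3]]

# Factors listed left to right, exactly as in the product in the example.
factors = [e_add(3, 2, 3, -1), e_add(3, 1, 2, -1), e_scale(3, 3, Fraction(1, 3)),
           e_add(3, 3, 2, 2), e_swap(3, 2, 3), e_add(3, 2, 1, -2), e_swap(3, 1, 3)]

product = Ab
for E in reversed(factors):     # the rightmost factor acts first
    product = matmul(E, product)
```

The loop runs over the factors in reverse because the rightmost elementary matrix corresponds to the first row operation performed.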


Now, consider an m × n matrix A and an elementary matrix E of order n. Then multiplying by E on the right to A corresponds to applying a column transformation on the matrix A. Therefore, for each elementary matrix, there is a corresponding column transformation. We summarize:


Definition 2.4.16 The column transformations obtained by right multiplication of elementary matrices are called elementary column operations.


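Example 2.4.17 Let $A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 0 & 3 \\ 3 & 4 & 5 \end{bmatrix}$ and consider the elementary column operation f which interchanges the second and the third column of A. Then
\[
f(A) = \begin{bmatrix} 1 & 3 & 2 \\ 2 & 3 & 0 \\ 3 & 5 & 4 \end{bmatrix}
= A \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix} = A E_{23}.
\]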


Exercise 2.4.18 1. Let e be an elementary row operation and let E = e(I) be the corresponding elementary matrix. That is, E is the matrix obtained from I by applying the elementary row operation e. Show that e(A) = EA.


2. Show that the Gauss elimination method is same as multiplying by a series of elementary matrices on the left to the augmented matrix.
Does the Gauss-Jordan method also correspond to multiplying by elementary matrices on the left? Give reasons.


3. Let A and B be two m x n matrices. Then prove that the two matrices A, B are row-equivalent if and 
only if B = PA, where P is product of elementary matrices. When is this P unique? 


2.5 Rank of a Matrix 


In previous sections, we solved linear systems using Gauss elimination method or the Gauss-Jordan 
method. In the examples considered, we have encountered three possibilities, namely 


1. existence of a unique solution,


2. existence of an infinite number of solutions, and


3. no solution.


Based on the above possibilities, we have the following definition. 


Definition 2.5.1 (Consistent, Inconsistent) A linear system is called CONSISTENT if it admits a solution 
and is called INCONSISTENT if it admits no solution. 


The question arises, as to whether there are conditions under which the linear system Ax = b is 
consistent. The answer to this question is in the affirmative. To proceed further, we need a few definitions 
and remarks. 

Recall that the row reduced echelon form of a matrix is unique and therefore, the number of non-zero rows is a unique number. Also, note that the number of non-zero rows in the row reduced form and in the row reduced echelon form of a matrix is the same.


Definition 2.5.2 (Row rank of a Matrix) The number of non-zero rows in the row reduced form of a 
matrix is called the row-rank of the matrix. 


By the very definition, it is clear that row-equivalent matrices have the same row-rank. For a matrix A, 
we write ‘row-rank (A)’ to denote the row-rank of A. 


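Example 2.5.3 1. Determine the row-rank of $A = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 3 & 1 \\ 1 & 1 & 2 \end{bmatrix}$.
Solution: To determine the row-rank of A, we proceed as follows.
\[
\text{(a)} \quad \begin{bmatrix} 1 & 2 & 1 \\ 2 & 3 & 1 \\ 1 & 1 & 2 \end{bmatrix}
\xrightarrow{R_{21}(-2),\; R_{31}(-1)}
\begin{bmatrix} 1 & 2 & 1 \\ 0 & -1 & -1 \\ 0 & -1 & 1 \end{bmatrix}
\]
\[
\text{(b)} \quad \begin{bmatrix} 1 & 2 & 1 \\ 0 & -1 & -1 \\ 0 & -1 & 1 \end{bmatrix}
\xrightarrow{R_2(-1),\; R_{32}(1)}
\begin{bmatrix} 1 & 2 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 2 \end{bmatrix}
\]
\[
\text{(c)} \quad \begin{bmatrix} 1 & 2 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 2 \end{bmatrix}
\xrightarrow{R_3(1/2),\; R_{12}(-2)}
\begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}
\]
\[
\text{(d)} \quad \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}
\xrightarrow{R_{23}(-1),\; R_{13}(1)}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]
The last matrix in Step 1d is the row reduced form of A which has 3 non-zero rows. Thus, row-rank(A) = 3. This result can also be easily deduced from the last matrix in Step 1b.

2. Determine the row-rank of $A = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 3 & 1 \\ 1 & 1 & 0 \end{bmatrix}$.
Solution: Here we have
\[
\text{(a)} \quad \begin{bmatrix} 1 & 2 & 1 \\ 2 & 3 & 1 \\ 1 & 1 & 0 \end{bmatrix}
\xrightarrow{R_{21}(-2),\; R_{31}(-1)}
\begin{bmatrix} 1 & 2 & 1 \\ 0 & -1 & -1 \\ 0 & -1 & -1 \end{bmatrix}
\]
\[
\text{(b)} \quad \begin{bmatrix} 1 & 2 & 1 \\ 0 & -1 & -1 \\ 0 & -1 & -1 \end{bmatrix}
\xrightarrow{R_2(-1),\; R_{32}(1)}
\begin{bmatrix} 1 & 2 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}
\]
From the last matrix in Step 2b, we deduce row-rank(A) = 2.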


Remark 2.5.4 Let Ax = b be a linear system with m equations and n unknowns. Then the row-reduced echelon form of A agrees with the first n columns of the row-reduced echelon form of [A b], and hence
\[ \text{row-rank}(A) \le \text{row-rank}([A \;\; b]). \]


The reader is advised to supply a proof. 


Remark 2.5.5 Consider a matrix A. After application of a finite number of elementary column operations (see Definition 2.4.16) to the matrix A, we can have a matrix, say B, which has the following properties:


1. The first nonzero entry in each column is 1. 
2. A column containing only 0’s comes after all columns with at least one non-zero entry. 


3. The first non-zero entry (the leading term) in each non-zero column moves down in successive 


columns. 


Therefore, we can define column-rank of A as the number of non-zero columns in B. It will be 
proved later that 


row-rank(A) = column-rank(A). 


Thus we are led to the following definition. 


Definition 2.5.6 The number of non-zero rows in the row reduced form of a matrix A is called the rank of 
A, denoted rank (A). 
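The rank computations of Example 2.5.3 can be repeated mechanically. The sketch below (the function name `rank` is our own, hypothetical choice) only carries out the forward elimination and counts the non-zero rows that remain; as noted in Step 1b of that example, this is already enough to read off the rank.

```python
from fractions import Fraction


def rank(matrix):
    """Count the non-zero rows left after forward elimination."""
    M = [[Fraction(v) for v in row] for row in matrix]
    rows, cols = len(M), len(M[0])
    found = 0                                   # leading terms found so far
    for col in range(cols):
        src = next((r for r in range(found, rows) if M[r][col] != 0), None)
        if src is None:
            continue                            # no leading term in this column
        M[found], M[src] = M[src], M[found]
        for r in range(found + 1, rows):        # clear entries below it
            factor = M[r][col] / M[found][col]
            M[r] = [a - factor * b for a, b in zip(M[r], M[found])]
        found += 1
    return found


rank_first = rank([[1, 2, 1], [2, 3, 1], [1, 1, 2]])
rank_second = rank([[1, 2, 1], [2, 3, 1], [1, 1, 0]])
```

The two values agree with the row-ranks 3 and 2 obtained by hand in Example 2.5.3.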


Theorem 2.5.7 Let A be a matrix of rank r. Then there exist elementary matrices $E_1, E_2, \ldots, E_s$ and $F_1, F_2, \ldots, F_\ell$ such that
\[ E_1 E_2 \cdots E_s \, A \, F_1 F_2 \cdots F_\ell = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}. \]

Proof. Let C be the row reduced echelon matrix obtained by applying elementary row operations to the given matrix A. As rank(A) = r, the matrix C will have the first r rows as the non-zero rows. So by Remark 2.4.5, C will have r leading columns, say $i_1, i_2, \ldots, i_r$. Note that, for $1 \le s \le r$, the $i_s$th column will have 1 in the sth row and zero elsewhere.

We now apply column operations to the matrix C. Let D be the matrix obtained from C by successively interchanging the sth and $i_s$th column of C for $1 \le s \le r$. Then the matrix D can be written in the form $\begin{bmatrix} I_r & B \\ 0 & 0 \end{bmatrix}$, where B is a matrix of appropriate size. As the (1, 1) block of D is an identity matrix, the block (1, 2) can be made the zero matrix by application of column operations to D. This gives the required result.


Exercise 2.5.8 1. Determine the ranks of the coefficient and the augmented matrices that appear in Part 1 and Part 2 of Exercise 2.4.12.


2. For any matrix A, prove that rank(A) = rank($A^t$).


3. Let A be an n × n matrix with rank(A) = n. Then prove that A is row-equivalent to $I_n$.

2.6 Existence of Solution of Ax = b


We try to understand the properties of the set of solutions of a linear system through an example, using 
the Gauss-Jordan method. Based on this observation, we arrive at the existence and uniqueness results 


for the linear system Ax = b. This example is more or less a motivation. 


2.6.1 Example 


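Consider a linear system Ax = b which after the application of the Gauss-Jordan method reduces to a matrix [C d] with
\[
[C \;\; d] = \begin{bmatrix}
1 & 0 & 2 & -1 & 0 & 0 & 2 & 8 \\
0 & 1 & 1 & 3 & 0 & 0 & 5 & 1 \\
0 & 0 & 0 & 0 & 1 & 0 & -1 & 2 \\
0 & 0 & 0 & 0 & 0 & 1 & 1 & 4 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}.
\]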

For this particular matrix [C d], we want to see the set of solutions. We start with some observations.
Observations:

1. The number of non-zero rows in C is 4. This number is also equal to the number of non-zero rows in [C d].

2. The first non-zero entry in the non-zero rows appears in columns 1, 2, 5 and 6.

3. Thus, the respective variables $x_1, x_2, x_5$ and $x_6$ are the basic variables.

4. The remaining variables, $x_3, x_4$ and $x_7$, are free variables.

5. We assign arbitrary constants $k_1, k_2$ and $k_3$ to the free variables $x_3, x_4$ and $x_7$, respectively.


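Hence, we have the set of solutions as
\[
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \\ x_7 \end{bmatrix}
= \begin{bmatrix} 8 - 2k_1 + k_2 - 2k_3 \\ 1 - k_1 - 3k_2 - 5k_3 \\ k_1 \\ k_2 \\ 2 + k_3 \\ 4 - k_3 \\ k_3 \end{bmatrix}
= \begin{bmatrix} 8 \\ 1 \\ 0 \\ 0 \\ 2 \\ 4 \\ 0 \end{bmatrix}
+ k_1 \begin{bmatrix} -2 \\ -1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}
+ k_2 \begin{bmatrix} 1 \\ -3 \\ 0 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}
+ k_3 \begin{bmatrix} -2 \\ -5 \\ 0 \\ 0 \\ 1 \\ -1 \\ 1 \end{bmatrix},
\]
where $k_1, k_2$ and $k_3$ are arbitrary.

Let
\[
u_0 = \begin{bmatrix} 8 \\ 1 \\ 0 \\ 0 \\ 2 \\ 4 \\ 0 \end{bmatrix}, \quad
u_1 = \begin{bmatrix} -2 \\ -1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \quad
u_2 = \begin{bmatrix} 1 \\ -3 \\ 0 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} \quad \text{and} \quad
u_3 = \begin{bmatrix} -2 \\ -5 \\ 0 \\ 0 \\ 1 \\ -1 \\ 1 \end{bmatrix}.
\]
Then it can easily be verified that $Cu_0 = d$, and for $1 \le i \le 3$, $Cu_i = 0$.
A similar idea is used in the proof of the next theorem, which is omitted here. The interested readers can read the proof in Appendix 14.1.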
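The verification mentioned above is a short computation. In the sketch below the helper names `apply` and `solution` are our own (hypothetical); the matrix C, the vector d and the vectors $u_0, \ldots, u_3$ are those of the example.

```python
C = [
    [1, 0, 2, -1, 0, 0, 2],
    [0, 1, 1, 3, 0, 0, 5],
    [0, 0, 0, 0, 1, 0, -1],
    [0, 0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
d = [8, 1, 2, 4, 0, 0]

u0 = [8, 1, 0, 0, 2, 4, 0]
u1 = [-2, -1, 1, 0, 0, 0, 0]
u2 = [1, -3, 0, 1, 0, 0, 0]
u3 = [-2, -5, 0, 0, 1, -1, 1]


def apply(matrix, vector):
    """Matrix-vector product, with the matrix given as a list of rows."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def solution(k1, k2, k3):
    """The vector u0 + k1*u1 + k2*u2 + k3*u3 for given constants."""
    return [a + k1 * b + k2 * c + k3 * e for a, b, c, e in zip(u0, u1, u2, u3)]


particular_ok = apply(C, u0) == d
homogeneous_ok = all(apply(C, u) == [0] * 6 for u in (u1, u2, u3))
sample_ok = apply(C, solution(3, -2, 5)) == d   # any choice of constants works
```

The constants 3, −2 and 5 in the last line are arbitrary; any other choice also gives a solution, since each $u_i$ with $i \ge 1$ is sent to the zero vector by C.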


2.6.2 Main Theorem 


Theorem 2.6.1 [Existence and Non-existence] Consider a linear system Ax = b, where A is an m × n matrix, and x, b are vectors with orders n × 1 and m × 1, respectively. Suppose rank(A) = r and rank([A b]) = $r_a$. Then exactly one of the following statements holds:

1. if $r_a = r < n$, the set of solutions of the linear system is an infinite set and has the form
\[ \{ u_0 + k_1 u_1 + k_2 u_2 + \cdots + k_{n-r} u_{n-r} : k_i \in \mathbb{R},\ 1 \le i \le n - r \}, \]
where $u_0, u_1, \ldots, u_{n-r}$ are n × 1 vectors satisfying $Au_0 = b$ and $Au_i = 0$ for $1 \le i \le n - r$.

2. if $r_a = r = n$, the solution set of the linear system has a unique n × 1 vector $x_0$ satisfying $Ax_0 = b$.

3. if $r < r_a$, the linear system has no solution.
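The three cases of the theorem translate directly into a comparison of two ranks. The following sketch makes that explicit; the names `rank` and `classify` are hypothetical, and the rank is computed by forward elimination with exact fractions.

```python
from fractions import Fraction


def rank(matrix):
    """Number of non-zero rows left after forward elimination."""
    M = [[Fraction(v) for v in row] for row in matrix]
    rows, cols = len(M), len(M[0])
    found = 0
    for col in range(cols):
        src = next((r for r in range(found, rows) if M[r][col] != 0), None)
        if src is None:
            continue
        M[found], M[src] = M[src], M[found]
        for r in range(found + 1, rows):
            factor = M[r][col] / M[found][col]
            M[r] = [a - factor * b for a, b in zip(M[r], M[found])]
        found += 1
    return found


def classify(A, b):
    """Decide which case of Theorem 2.6.1 applies to the system Ax = b."""
    n = len(A[0])                                   # number of unknowns
    r = rank(A)
    r_a = rank([row + [bi] for row, bi in zip(A, b)])   # rank of [A b]
    if r < r_a:
        return "no solution"
    return "unique solution" if r == n else "infinite number of solutions"


A1 = [[0, 1, 1], [2, 0, 3], [1, 1, 1]]      # Example 2.3.10
A2 = [[1, 1, 1], [1, 2, 2], [3, 4, 4]]      # Examples 2.3.11 and 2.3.12

case_a = classify(A1, [2, 5, 3])
case_b = classify(A2, [3, 5, 11])
case_c = classify(A2, [3, 5, 12])
```

The three results match the conclusions of Examples 2.3.10, 2.3.11 and 2.3.12, respectively.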


Remark 2.6.2 Let A be an m × n matrix and consider the linear system Ax = b. Then by Theorem 2.6.1, we see that the linear system Ax = b is consistent if and only if
\[ \text{rank}(A) = \text{rank}([A \;\; b]). \]
The following corollary of Theorem 2.6.1 is a very important result about the homogeneous linear system Ax = 0.


Corollary 2.6.3 Let A be an mx n matrix. Then the homogeneous system Ax = 0 has a non-trivial solution 
if and only if rank(A) <n. 


Proof. Suppose the system Ax = 0 has a non-trivial solution, $x_0$. That is, $Ax_0 = 0$ and $x_0 \ne 0$. Under this assumption, we need to show that rank(A) < n. On the contrary, assume that rank(A) = n. So,
\[ n = \text{rank}(A) = \text{rank}([A \;\; 0]) = r_a. \]
Also A0 = 0 implies that 0 is a solution of the linear system Ax = 0. Hence, by the uniqueness of the solution under the condition $r = r_a = n$ (see Theorem 2.6.1), we get $x_0 = 0$. This contradicts the fact that $x_0$ was a given non-trivial solution.

Now, let us assume that rank(A) < n. Then
\[ r_a = \text{rank}([A \;\; 0]) = \text{rank}(A) < n. \]
So, by Theorem 2.6.1, the solution set of the linear system Ax = 0 has an infinite number of vectors x satisfying Ax = 0. From this infinite set, we can choose any vector $x_0$ that is different from 0. Thus, we have a solution $x_0 \ne 0$. That is, we have obtained a non-trivial solution $x_0$.


We now state another important result whose proof is immediate from Theorem 2.6.1 and Corollary 2.6.3.

Proposition 2.6.4 Consider the linear system Ax = b. Then the two statements given below cannot hold together.


1. The system Ax = b has a unique solution for every b. 


2. The system Ax = 0 has a non-trivial solution. 


Remark 2.6.5 1. Suppose $x_1, x_2$ are two solutions of Ax = 0. Then $k_1 x_1 + k_2 x_2$ is also a solution of Ax = 0 for any $k_1, k_2 \in \mathbb{R}$.


2. If u, v are two solutions of Ax = b then u − v is a solution of the system Ax = 0. That is, $u - v = x_h$ for some solution $x_h$ of Ax = 0. That is, any two solutions of Ax = b differ by a solution of the associated homogeneous system Ax = 0.


In conclusion, for $b \ne 0$, the set of solutions of the system Ax = b is of the form $\{x_0 + x_h\}$, where $x_0$ is a particular solution of Ax = b and $x_h$ is a solution of Ax = 0.
2.6.3 Exercises 


Exercise 2.6.6 1. For what values of c and k do the following systems have i) no solution, ii) a unique solution and iii) infinite number of solutions?

(a) x + y + z = 3, x + 2y + cz = 4, 2x + 3y + 2cz = k.
(b) x + y + z = 3, x + y + 2cz = 7, x + 2y + 3cz = k.
(c) x + y + 2z = 3, x + 2y + cz = 5, x + 2y + 4z = k.
(d) kx + y + z = 1, x + ky + z = 1, x + y + kz = 1.
(e) x + 2y − z = 1, 2x + 3y + kz = 3, x + ky + 3z = 2.
(f) x − 2y = 1, x − y + kz = 1, ky + 4z = 6.


2. Find the condition on a, b, c so that the linear system
x + 2y − 3z = a, 2x + 6y − 11z = b, x − 2y + 7z = c
is consistent.


3. Let A be an n × n matrix. If the system $A^2x = 0$ has a non-trivial solution then show that Ax = 0 also has a non-trivial solution.


2.7 Invertible Matrices 


2.7.1 Inverse of a Matrix 

Definition 2.7.1 (Inverse of a Matrix) Let A be a square matrix of order n.
1. A square matrix B is said to be a LEFT INVERSE of A if $BA = I_n$.
2. A square matrix C is called a RIGHT INVERSE of A if $AC = I_n$.
3. A matrix A is said to be INVERTIBLE (or is said to have an INVERSE) if there exists a matrix B such that $AB = BA = I_n$.


Lemma 2.7.2 Let A be an n × n matrix. Suppose that there exist n × n matrices B and C such that $AB = I_n$ and $CA = I_n$, then B = C.

Proof. Note that
\[ C = CI_n = C(AB) = (CA)B = I_nB = B. \]


Remark 2.7.3 1. From the above lemma, we observe that if a matrix A is invertible, then the inverse is unique.


2. As the inverse of a matrix A is unique, we denote it by $A^{-1}$. That is, $AA^{-1} = A^{-1}A = I$.


Theorem 2.7.4 Let A and B be two matrices with inverses $A^{-1}$ and $B^{-1}$, respectively. Then
1. $(A^{-1})^{-1} = A$,
2. $(AB)^{-1} = B^{-1}A^{-1}$,
3. $(A^t)^{-1} = (A^{-1})^t$.


Proof. Proof of Part 1.
By definition $AA^{-1} = A^{-1}A = I$. Hence, if we denote $A^{-1}$ by B, then we get AB = BA = I. This again, by definition, implies $B^{-1} = A$, or equivalently $(A^{-1})^{-1} = A$.
Proof of Part 2.
Verify that $(AB)(B^{-1}A^{-1}) = I = (B^{-1}A^{-1})(AB)$. Hence, the result follows by definition.
Proof of Part 3.
We know $AA^{-1} = A^{-1}A = I$. Taking transpose, we get
\[ (AA^{-1})^t = (A^{-1}A)^t = I^t \iff (A^{-1})^t A^t = A^t (A^{-1})^t = I. \]
Hence, by definition $(A^t)^{-1} = (A^{-1})^t$.
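A small worked instance may help to see why the order of the factors is reversed in Part 2. The two matrices below are our own choice for illustration and do not appear elsewhere in the text.

```latex
\[
A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \quad
B = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}, \quad
A^{-1} = \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix}, \quad
B^{-1} = \begin{bmatrix} 1 & 0 \\ -1 & 1 \end{bmatrix}.
\]
\[
AB = \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix}, \qquad
B^{-1}A^{-1} = \begin{bmatrix} 1 & -1 \\ -1 & 2 \end{bmatrix}, \qquad
(AB)(B^{-1}A^{-1}) = \begin{bmatrix} 2 - 1 & -2 + 2 \\ 1 - 1 & -1 + 2 \end{bmatrix} = I_2.
\]
\[
\text{In contrast,} \quad
A^{-1}B^{-1} = \begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}
\quad \text{and} \quad
(AB)(A^{-1}B^{-1}) = \begin{bmatrix} 3 & -1 \\ 1 & 0 \end{bmatrix} \ne I_2.
\]
```

So for this pair $(AB)^{-1} = B^{-1}A^{-1}$, while the product $A^{-1}B^{-1}$ taken in the other order is not the inverse of AB.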


Exercise 2.7.5 1. If A is a symmetric matrix, is the matrix $A^{-1}$ symmetric?


2. Show that every elementary matrix is invertible. Is the inverse of an elementary matrix also an elementary matrix?


3. Let $A_1, A_2, \ldots, A_r$ be invertible matrices. Prove that the product $A_1A_2 \cdots A_r$ is also an invertible matrix.


4. If P and Q are invertible matrices and PAQ is defined then show that rank(PAQ) = rank(A).


5. Find matrices P and Q which are product of elementary matrices such that B = PAQ where $A = \begin{bmatrix} 2 & 4 & 8 \\ 1 & 3 & 2 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}$.

6. Let A and B be two matrices. Show that 
(a) if A+B is defined, then rank(A + B) < rank(A) + rank(B), 
(b) if AB is defined, then rank(AB) < rank(A) and rank(AB) < rank(B). 
7. Let A be any matrix of rank r. Then show that there exist invertible matrices $B_i, C_i$ such that
\[
B_1A = \begin{bmatrix} R_1 & R_2 \\ 0 & 0 \end{bmatrix}, \quad
AC_1 = \begin{bmatrix} S_1 & 0 \\ S_3 & 0 \end{bmatrix}, \quad
B_2AC_2 = \begin{bmatrix} A_1 & 0 \\ 0 & 0 \end{bmatrix}, \quad \text{and} \quad
B_3AC_3 = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}.
\]
Also, prove that the matrix $A_1$ is an r × r invertible matrix.

8. Let A be an m × n matrix of rank r. Then A can be written as A = BC, where both B and C have rank r and B is a matrix of size m × r and C is a matrix of size r × n.


9. Let A and B be two matrices such that AB is defined and rank(A) = rank(AB). Then show that A = ABX for some matrix X. Similarly, if BA is defined and rank(A) = rank(BA), then A = YBA for some matrix Y. [Hint: Choose non-singular matrices P, Q and R such that $PAQ = \begin{bmatrix} A_1 & 0 \\ 0 & 0 \end{bmatrix}$ and $P(AB)R = \begin{bmatrix} C & 0 \\ 0 & 0 \end{bmatrix}$. Define $X = R \begin{bmatrix} C^{-1}A_1 & 0 \\ 0 & 0 \end{bmatrix} Q^{-1}$.]


10. Let $A = [a_{ij}]$ be an invertible matrix and let $B = [p^{i-j}a_{ij}]$ for some nonzero real number p. Find the inverse of B.


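11. If matrices B and C are invertible and the involved partitioned products are defined, then show that
\[
\begin{bmatrix} A & B \\ C & 0 \end{bmatrix}^{-1}
= \begin{bmatrix} 0 & C^{-1} \\ B^{-1} & -B^{-1}AC^{-1} \end{bmatrix}.
\]


12. Suppose A is the inverse of a matrix B. Partition A and B as follows:
\[
A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}, \qquad
B = \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix}.
\]
If $A_{11}$ is invertible and $P = A_{22} - A_{21}(A_{11}^{-1}A_{12})$, then show that
\[
B_{11} = A_{11}^{-1} + (A_{11}^{-1}A_{12})P^{-1}(A_{21}A_{11}^{-1}), \quad
B_{21} = -P^{-1}(A_{21}A_{11}^{-1}), \quad
B_{12} = -(A_{11}^{-1}A_{12})P^{-1},
\]
and $B_{22} = P^{-1}$.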


2.7.2 Equivalent conditions for Invertibility 


Definition 2.7.6 A square matrix A of order n is said to be of full rank if rank(A) = n.


Theorem 2.7.7 For a square matrix A of order n, the following statements are equivalent. 
1. A is invertible. 
2. A is of full rank. 
3. A is row-equivalent to the identity matrix. 
4. A is a product of elementary matrices.


Proof. 1 ⟹ 2
Let if possible rank(A) = r < n. Then there exists an invertible matrix P (a product of elementary matrices) such that $PA = \begin{bmatrix} B_1 & B_2 \\ 0 & 0 \end{bmatrix}$, where $B_1$ is an r × r matrix. Since A is invertible, let $A^{-1} = \begin{bmatrix} C_1 \\ C_2 \end{bmatrix}$, where $C_1$ is an r × n matrix. Then
\[
P = PI_n = P(AA^{-1}) = (PA)A^{-1}
= \begin{bmatrix} B_1 & B_2 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} C_1 \\ C_2 \end{bmatrix}
= \begin{bmatrix} B_1C_1 + B_2C_2 \\ 0 \end{bmatrix}. \tag{2.7.1}
\]
Thus the matrix P has n − r rows as zero rows. Hence, P cannot be invertible. This contradicts P being a product of invertible matrices. Thus, A is of full rank.

2 ⟹ 3
Suppose A is of full rank. This implies, the row reduced echelon form of A has all non-zero rows. But A has as many columns as rows and therefore, the last row of the row reduced echelon form of A will be (0, 0, ..., 0, 1). Hence, the row reduced echelon form of A is the identity matrix.

3 ⟹ 4
Since A is row-equivalent to the identity matrix there exist elementary matrices $E_1, E_2, \ldots, E_k$ such that $A = E_1E_2 \cdots E_kI_n$. That is, A is product of elementary matrices.

4 ⟹ 1
Suppose $A = E_1E_2 \cdots E_k$, where the $E_i$'s are elementary matrices. Since elementary matrices are invertible and a product of invertible matrices is also invertible, we get the required result.


The ideas of Theorem [2.7.7] will be used in the next subsection to find the inverse of an invertible 
matrix. The idea used in the proof of the first part also gives the following important Theorem. We 
repeat the proof for the sake of clarity. 


Theorem 2.7.8 Let A be a square matrix of order n.
1. Suppose there exists a matrix B such that $AB = I_n$. Then $A^{-1}$ exists.
2. Suppose there exists a matrix C such that $CA = I_n$. Then $A^{-1}$ exists.


Proof. Suppose that $AB = I_n$. We will prove that the matrix A is of full rank. That is, rank(A) = n. Let if possible, rank(A) = r < n. Then there exists an invertible matrix P (a product of elementary matrices) such that $PA = \begin{bmatrix} C_1 & C_2 \\ 0 & 0 \end{bmatrix}$. Let $B = \begin{bmatrix} B_1 \\ B_2 \end{bmatrix}$, where $B_1$ is an r × n matrix. Then
\[
P = PI_n = P(AB) = (PA)B
= \begin{bmatrix} C_1 & C_2 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} B_1 \\ B_2 \end{bmatrix}
= \begin{bmatrix} C_1B_1 + C_2B_2 \\ 0 \end{bmatrix}. \tag{2.7.2}
\]
Thus the matrix P has n − r rows as zero rows. So, P cannot be invertible. This contradicts P being a product of invertible matrices. Thus, rank(A) = n. That is, A is of full rank. Hence, using Theorem 2.7.7, A is an invertible matrix. That is, $BA = I_n$ as well.

Using the first part, it is clear that the matrix C in the second part is invertible. Hence
\[ AC = I_n = CA. \]
Thus, A is invertible as well.

Remark 2.7.9 This theorem implies the following: “if we want to show that a square matrix A of order n is invertible, it is enough to show the existence of

1. either a matrix B such that $AB = I_n$,

2. or a matrix C such that $CA = I_n$.”


Theorem 2.7.10 The following statements are equivalent for a square matrix A of order n. 
1. A is invertible. 
2. Ax = 0 has only the trivial solution x = 0. 


3. Ax = b has a solution x for every b.

Proof. 1 ⟹ 2
Since A is invertible, by Theorem 2.7.7, A is of full rank. That is, for the linear system Ax = 0, the number of unknowns is equal to the rank of the matrix A. Hence, by Theorem 2.6.1, the system Ax = 0 has a unique solution x = 0.

2 ⟹ 1
Let if possible A be non-invertible. Then by Theorem 2.7.7, the matrix A is not of full rank. Thus by Corollary 2.6.3, the linear system Ax = 0 has an infinite number of solutions. This contradicts the assumption that Ax = 0 has only the trivial solution x = 0.

1 ⟹ 3
Since A is invertible, for every b, the system Ax = b has a unique solution $x = A^{-1}b$.

3 ⟹ 1
For $1 \le i \le n$, define $e_i = (0, \ldots, 0, 1, 0, \ldots, 0)^t$, with the 1 at the ith position, and consider the linear system $Ax = e_i$. By assumption, this system has a solution $x_i$ for each i, $1 \le i \le n$. Define a matrix $B = [x_1, x_2, \ldots, x_n]$. That is, the ith column of B is the solution of the system $Ax = e_i$. Then
\[ AB = A[x_1, x_2, \ldots, x_n] = [Ax_1, Ax_2, \ldots, Ax_n] = [e_1, e_2, \ldots, e_n] = I_n. \]
Therefore, by Theorem 2.7.8, the matrix A is invertible.


Exercise 2.7.11 1. Show that a triangular matrix A is invertible if and only if each diagonal entry of A is non-zero.

2. Let A be a $1 \times 2$ matrix and B be a $2 \times 1$ matrix having positive entries. Which of BA or AB is invertible? Give reasons.

3. Let A be an $n \times m$ matrix and B be an $m \times n$ matrix. Prove that the matrix $I - BA$ is invertible if and only if the matrix $I - AB$ is invertible.
2.7.3. Inverse and Gauss-Jordan Method 


We first give a consequence of Theorem and then use it to find the inverse of an invertible matrix. 


Corollary 2.7.12 Let A be an invertible $n \times n$ matrix. Suppose that a sequence of elementary row-operations reduces A to the identity matrix. Then the same sequence of elementary row-operations when applied to the identity matrix yields $A^{-1}$.


Proof. Let A be a square matrix of order n. Also, let $E_1, E_2, \ldots, E_k$ be a sequence of elementary row operations such that $E_1 E_2 \cdots E_k A = I_n$. Then $E_1 E_2 \cdots E_k I_n = A^{-1}$. This implies $A^{-1} = E_1 E_2 \cdots E_k$.


Summary: Let A be an $n \times n$ matrix. Apply the Gauss-Jordan method to the matrix $[A \;\; I_n]$. Suppose the row reduced echelon form of the matrix $[A \;\; I_n]$ is $[B \;\; C]$. If $B = I_n$, then $A^{-1} = C$, or else A is not invertible.
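The summary above can be sketched in code. The following is a minimal illustration, not the authors' implementation: the function name `gauss_jordan_inverse` and the use of exact rational arithmetic are assumptions made for this sketch. It reduces $[A \;\; I_n]$ and returns the right-hand block when the left-hand block becomes $I_n$.

```python
from fractions import Fraction


def gauss_jordan_inverse(a):
    """Return the inverse of the square matrix `a` (a list of rows), or None.

    Hypothetical helper for illustration: row-reduces [A | I_n] with exact
    fractions, so no rounding error is involved.
    """
    n = len(a)
    # Build the augmented matrix [A | I_n].
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        # Look for a row at or below the diagonal with a non-zero pivot.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None  # the left block cannot be reduced to I_n
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so that the pivot becomes 1.
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        # Clear every other entry in the pivot column.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    # The left block is now I_n, so the right block is the inverse.
    return [row[n:] for row in aug]


# The matrix of Example 2.7.13.
inv = gauss_jordan_inverse([[2, 1, 1], [1, 2, 1], [1, 1, 2]])
```

The order of row operations chosen here may differ from a hand computation, but the reduced form, and hence the inverse, is the same.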


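Example 2.7.13 Find the inverse of the matrix $\begin{bmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{bmatrix}$ using Gauss-Jordan method.

Solution: Consider the matrix $\begin{bmatrix} 2 & 1 & 1 & 1 & 0 & 0 \\ 1 & 2 & 1 & 0 & 1 & 0 \\ 1 & 1 & 2 & 0 & 0 & 1 \end{bmatrix}$. A sequence of steps in the Gauss-Jordan method is:

1. $\begin{bmatrix} 2 & 1 & 1 & 1 & 0 & 0 \\ 1 & 2 & 1 & 0 & 1 & 0 \\ 1 & 1 & 2 & 0 & 0 & 1 \end{bmatrix} \xrightarrow{R_1(1/2)} \begin{bmatrix} 1 & \frac12 & \frac12 & \frac12 & 0 & 0 \\ 1 & 2 & 1 & 0 & 1 & 0 \\ 1 & 1 & 2 & 0 & 0 & 1 \end{bmatrix}$

2. $\begin{bmatrix} 1 & \frac12 & \frac12 & \frac12 & 0 & 0 \\ 1 & 2 & 1 & 0 & 1 & 0 \\ 1 & 1 & 2 & 0 & 0 & 1 \end{bmatrix} \xrightarrow{R_{21}(-1),\ R_{31}(-1)} \begin{bmatrix} 1 & \frac12 & \frac12 & \frac12 & 0 & 0 \\ 0 & \frac32 & \frac12 & -\frac12 & 1 & 0 \\ 0 & \frac12 & \frac32 & -\frac12 & 0 & 1 \end{bmatrix}$

3. $\begin{bmatrix} 1 & \frac12 & \frac12 & \frac12 & 0 & 0 \\ 0 & \frac32 & \frac12 & -\frac12 & 1 & 0 \\ 0 & \frac12 & \frac32 & -\frac12 & 0 & 1 \end{bmatrix} \xrightarrow{R_2(2/3)} \begin{bmatrix} 1 & \frac12 & \frac12 & \frac12 & 0 & 0 \\ 0 & 1 & \frac13 & -\frac13 & \frac23 & 0 \\ 0 & \frac12 & \frac32 & -\frac12 & 0 & 1 \end{bmatrix}$

4. $\begin{bmatrix} 1 & \frac12 & \frac12 & \frac12 & 0 & 0 \\ 0 & 1 & \frac13 & -\frac13 & \frac23 & 0 \\ 0 & \frac12 & \frac32 & -\frac12 & 0 & 1 \end{bmatrix} \xrightarrow{R_{32}(-1/2)} \begin{bmatrix} 1 & \frac12 & \frac12 & \frac12 & 0 & 0 \\ 0 & 1 & \frac13 & -\frac13 & \frac23 & 0 \\ 0 & 0 & \frac43 & -\frac13 & -\frac13 & 1 \end{bmatrix}$

5. $\begin{bmatrix} 1 & \frac12 & \frac12 & \frac12 & 0 & 0 \\ 0 & 1 & \frac13 & -\frac13 & \frac23 & 0 \\ 0 & 0 & \frac43 & -\frac13 & -\frac13 & 1 \end{bmatrix} \xrightarrow{R_3(3/4)} \begin{bmatrix} 1 & \frac12 & \frac12 & \frac12 & 0 & 0 \\ 0 & 1 & \frac13 & -\frac13 & \frac23 & 0 \\ 0 & 0 & 1 & -\frac14 & -\frac14 & \frac34 \end{bmatrix}$

6. $\begin{bmatrix} 1 & \frac12 & \frac12 & \frac12 & 0 & 0 \\ 0 & 1 & \frac13 & -\frac13 & \frac23 & 0 \\ 0 & 0 & 1 & -\frac14 & -\frac14 & \frac34 \end{bmatrix} \xrightarrow{R_{23}(-1/3),\ R_{13}(-1/2)} \begin{bmatrix} 1 & \frac12 & 0 & \frac58 & \frac18 & -\frac38 \\ 0 & 1 & 0 & -\frac14 & \frac34 & -\frac14 \\ 0 & 0 & 1 & -\frac14 & -\frac14 & \frac34 \end{bmatrix}$

7. $\begin{bmatrix} 1 & \frac12 & 0 & \frac58 & \frac18 & -\frac38 \\ 0 & 1 & 0 & -\frac14 & \frac34 & -\frac14 \\ 0 & 0 & 1 & -\frac14 & -\frac14 & \frac34 \end{bmatrix} \xrightarrow{R_{12}(-1/2)} \begin{bmatrix} 1 & 0 & 0 & \frac34 & -\frac14 & -\frac14 \\ 0 & 1 & 0 & -\frac14 & \frac34 & -\frac14 \\ 0 & 0 & 1 & -\frac14 & -\frac14 & \frac34 \end{bmatrix}$

8. Thus, the inverse of the given matrix is $\begin{bmatrix} 3/4 & -1/4 & -1/4 \\ -1/4 & 3/4 & -1/4 \\ -1/4 & -1/4 & 3/4 \end{bmatrix}$.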


Exercise 2.7.14 Find the inverse of the following matrices using Gauss-Jordan method. 


2 3 ee ee a hae 
(li Be Bae PBS Gi et. eB 8 
2-49 pe a) 2 4 1 


2.8 Determinant 


Notation: For an $n \times n$ matrix A, by $A(\alpha|\beta)$, we mean the submatrix B of A, which is obtained by deleting the $\alpha$th row and $\beta$th column.


. Then A(1|2) = e : 


Be WwW bd 
NW W 


1 
Example 2.8.1 Consider a matrix A = |1 
2 
A(1, 2|1,3) = [4]. 
Definition 2.8.2 (Determinant of a Square Matrix) Let A be a square matrix of order n. With A, we associate inductively (on n) a number, called the determinant of A, written det(A) (or |A|), by

\[
\det(A) = \begin{cases} a & \text{if } A = [a] \ (n = 1), \\ \sum\limits_{j=1}^{n} (-1)^{1+j} a_{1j} \det\big(A(1|j)\big) & \text{otherwise.} \end{cases}
\]

Definition 2.8.3 (Minor, Cofactor of a Matrix) The number $\det(A(i|j))$ is called the $(i,j)$th minor of A. We write $A_{ij} = \det(A(i|j))$. The $(i,j)$th cofactor of A, denoted $C_{ij}$, is the number $(-1)^{i+j} A_{ij}$.


Example 2.8.4 1. Let $A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$. Then, $\det(A) = |A| = a_{11}A_{11} - a_{12}A_{12} = a_{11}a_{22} - a_{12}a_{21}$.


For example, for $A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$, $\det(A) = |A| = 1 - 2 \cdot 2 = -3$.


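2. Let $A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$. Then,

\begin{align*}
\det(A) = |A| &= a_{11}A_{11} - a_{12}A_{12} + a_{13}A_{13} \\
&= a_{11}\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{12}\begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + a_{13}\begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix} \\
&= a_{11}(a_{22}a_{33} - a_{23}a_{32}) - a_{12}(a_{21}a_{33} - a_{31}a_{23}) + a_{13}(a_{21}a_{32} - a_{31}a_{22}) \\
&= a_{11}a_{22}a_{33} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31}. \tag{2.8.1}
\end{align*}

For example, if $A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \\ 1 & 2 & 2 \end{bmatrix}$ then

\[ \det(A) = |A| = 1 \cdot \begin{vmatrix} 3 & 1 \\ 2 & 2 \end{vmatrix} - 2 \cdot \begin{vmatrix} 2 & 1 \\ 1 & 2 \end{vmatrix} + 3 \cdot \begin{vmatrix} 2 & 3 \\ 1 & 2 \end{vmatrix} = 4 - 2(3) + 3(1) = 1. \]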
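Definitions 2.8.2 and 2.8.3 translate almost directly into a recursive routine. The sketch below is only an illustration under stated assumptions: the names `submatrix`, `det` and `cofactor` are hypothetical, indices are 0-based (so the sign $(-1)^{1+j}$ becomes `(-1) ** j`), and the recursion is far too slow for large matrices.

```python
def submatrix(a, i, j):
    """A(i|j): delete row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]


def det(a):
    """Determinant by expansion along the first row, as in Definition 2.8.2."""
    if len(a) == 1:
        return a[0][0]
    # With 0-based j, the sign (-1)^(1 + j) of the definition is (-1)^j.
    return sum((-1) ** j * a[0][j] * det(submatrix(a, 0, j))
               for j in range(len(a)))


def cofactor(a, i, j):
    """(i, j)th cofactor: signed determinant of the submatrix A(i|j)."""
    return (-1) ** (i + j) * det(submatrix(a, i, j))


A = [[1, 2, 3], [2, 3, 1], [1, 2, 2]]
value = det(A)  # the 3 x 3 matrix of Example 2.8.4
```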


Exercise 2.8.5 = 1. Find the determinant of the following matrices. 


12 7 8 3 5 2 1 
} | | fae 
, 10 4 3 2 _.. |O 2 O a 
i) , it) , wi) |1 b BP 
002 3 6 -—7 1 2 
Cc Cc 
0 0 0 5 2 0 3 0 


2. Show that the determinant of a triangular matrix is the product of its diagonal entries. 


Definition 2.8.6 A matrix A is said to be a singular matrix if det(A) = 0. It is called non-singular if $\det(A) \neq 0$.


The proof of the next theorem is omitted. The interested reader is advised to go through Appendix 
14.3 


Theorem 2.8.7 Let A be an n x n matrix. Then 
1. if B is obtained from A by interchanging two rows, then det(B) = — det(A), 
2. if B is obtained from A by multiplying a row by c then det(B) = cdet(A), 
3. if all the elements of one row or column are 0 then det(A) = 0, 


4. if B is obtained from A by replacing the $j$th row by itself plus k times the $i$th row, where $i \neq j$, then det(B) = det(A),


5. if A is a square matrix having two rows equal then det(A) = 0.

Remark 2.8.8 1. Many authors define the determinant using “Permutations.” It turns out that THE WAY WE HAVE DEFINED DETERMINANT is usually called the expansion of the determinant along the first row.

2. Part 1 of Theorem 2.8.7 implies that “one can also calculate the determinant by expanding along any row.” Hence, for an $n \times n$ matrix A, for every k, $1 \le k \le n$, one also has

\[ \det(A) = \sum_{j=1}^{n} (-1)^{k+j} a_{kj} \det\big(A(k|j)\big). \]
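Theorem 2.8.7 and Remark 2.8.8 can be spot-checked numerically. The sketch below is not a proof, only a check on one matrix: the helper name `det_along_row` is an assumption of this illustration, and rows are indexed from 0.

```python
def submatrix(a, i, j):
    """A(i|j): delete row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]


def det_along_row(a, k=0):
    """Determinant by cofactor expansion along row k (0-based)."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** (k + j) * a[k][j] * det_along_row(submatrix(a, k, j))
               for j in range(len(a)))


A = [[1, 2, 3], [2, 3, 1], [1, 2, 2]]

# Row operations from Theorem 2.8.7, applied to A.
swapped = [A[1], A[0], A[2]]                                     # interchange two rows
scaled = [[5 * x for x in A[0]], A[1], A[2]]                     # multiply a row by c = 5
sheared = [A[0], A[1], [x + 4 * y for x, y in zip(A[2], A[0])]]  # row 3 plus 4 times row 1
repeated = [A[0], A[1], A[0]]                                    # two equal rows

expansions = [det_along_row(A, k) for k in range(3)]  # same value along every row
```

For this matrix every expansion gives 1, the row interchange flips the sign, the scaling multiplies the determinant by 5, the row addition leaves it unchanged, and the repeated row gives 0.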


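Remark 2.8.9 1. Let $u^t = (u_1, u_2)$ and $v^t = (v_1, v_2)$ be two vectors in $\mathbb{R}^2$. Then consider the parallelogram, PQRS, formed by the vertices $\{P = (0,0)^t, Q = u, S = v, R = u + v\}$. We

Claim: $\text{Area}(PQRS) = \left| \det\begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \end{bmatrix} \right| = |u_1 v_2 - u_2 v_1|$.

Recall that the dot product is $u \bullet v = u_1 v_1 + u_2 v_2$, and $\sqrt{u \bullet u} = \sqrt{u_1^2 + u_2^2}$ is the length of the vector u. We denote the length by $\ell(u)$. With the above notation, if $\theta$ is the angle between the vectors u and v, then

\[ \cos(\theta) = \frac{u \bullet v}{\ell(u)\,\ell(v)}. \]

This tells us that

\begin{align*}
\text{Area}(PQRS) &= \ell(u)\,\ell(v)\sin(\theta) = \ell(u)\,\ell(v)\sqrt{1 - \left(\frac{u \bullet v}{\ell(u)\,\ell(v)}\right)^2} \\
&= \sqrt{\ell(u)^2\,\ell(v)^2 - (u \bullet v)^2} = \sqrt{(u_1 v_2 - u_2 v_1)^2} \\
&= |u_1 v_2 - u_2 v_1|.
\end{align*}

Hence, the claim holds. That is, in $\mathbb{R}^2$, the determinant is $\pm$ times the area of the parallelogram.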


2. Let $u = (u_1, u_2, u_3)$, $v = (v_1, v_2, v_3)$ and $w = (w_1, w_2, w_3)$ be three elements of $\mathbb{R}^3$. Recall that the cross product of two vectors in $\mathbb{R}^3$ is

\[ u \times v = (u_2 v_3 - u_3 v_2,\; u_3 v_1 - u_1 v_3,\; u_1 v_2 - u_2 v_1). \]

Note here that if $A = [u^t, v^t, w^t]$, then

\[ \det(A) = \begin{vmatrix} u_1 & v_1 & w_1 \\ u_2 & v_2 & w_2 \\ u_3 & v_3 & w_3 \end{vmatrix} = u \bullet (v \times w) = v \bullet (w \times u) = w \bullet (u \times v). \]

Let P be the parallelepiped formed with (0,0,0) as a vertex and the vectors u, v, w as adjacent vertices. Then observe that $u \times v$ is a vector perpendicular to the plane that contains the parallelogram formed by the vectors u and v. So, to compute the volume of the parallelepiped P, we need to look at $\cos(\theta)$, where $\theta$ is the angle between the vector w and the normal vector to the parallelogram formed by u and v. So,

\[ \text{volume}(P) = |w \bullet (u \times v)|. \]

Hence, $|\det(A)| = \text{volume}(P)$.


3. Let $u_1, u_2, \ldots, u_n \in \mathbb{R}^{n \times 1}$ and let $A = [u_1, u_2, \ldots, u_n]$ be an $n \times n$ matrix. Then the following properties of det(A) also hold for the volume of an n-dimensional parallelepiped formed with $0 \in \mathbb{R}^{n \times 1}$ as one vertex and the vectors $u_1, u_2, \ldots, u_n$ as adjacent vertices:

(a) If $u_1 = (1, 0, \ldots, 0)^t$, $u_2 = (0, 1, 0, \ldots, 0)^t$, ..., and $u_n = (0, \ldots, 0, 1)^t$, then det(A) = 1. Also, the volume of a unit n-dimensional cube is 1.

(b) If we replace the vector $u_i$ by $\alpha u_i$, for some $\alpha \in \mathbb{R}$, then the determinant of the new matrix is $\alpha \cdot \det(A)$. This is also true for the volume, as the original volume gets multiplied by $\alpha$.

(c) If $u_1 = u_i$ for some i, $2 \le i \le n$, then the vectors $u_1, u_2, \ldots, u_n$ will give rise to an $(n-1)$-dimensional parallelepiped. So, this parallelepiped lies on an $(n-1)$-dimensional hyperplane. Thus, its n-dimensional volume will be zero. Also, $|\det(A)| = |0| = 0$.

In general, for any $n \times n$ matrix A, it can be proved that $|\det(A)|$ is indeed equal to the volume of the n-dimensional parallelepiped. The actual proof is beyond the scope of this book.
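The identities in Remark 2.8.9 are easy to check on concrete vectors. The following sketch uses hypothetical helper names (`dot`, `cross`, `triple`, `parallelogram_area`) and integer vectors chosen for this illustration, so that the matrix with columns u, v, w is the matrix of Example 2.8.4, whose determinant is 1.

```python
def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(x * y for x, y in zip(u, v))


def cross(u, v):
    """Cross product of two vectors in R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def triple(u, v, w):
    """Scalar triple product u . (v x w)."""
    return dot(u, cross(v, w))


def parallelogram_area(u, v):
    """Area of the parallelogram spanned by u, v in R^2: |u1 v2 - u2 v1|."""
    return abs(u[0] * v[1] - u[1] * v[0])


# Columns of the matrix [[1, 2, 3], [2, 3, 1], [1, 2, 2]].
u, v, w = (1, 2, 1), (2, 3, 2), (3, 1, 2)
products = (triple(u, v, w), triple(v, w, u), triple(w, u, v))
```

All three cyclic triple products agree, and their common absolute value is the volume of the parallelepiped on u, v, w.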
2.8.1 Adjoint of a Matrix 


Recall that for a square matrix A, the notations $A_{ij}$ and $C_{ij} = (-1)^{i+j}A_{ij}$ were respectively used to denote the $(i,j)$th minor and the $(i,j)$th cofactor of A.


Definition 2.8.10 (Adjoint of a Matrix) Let A be an $n \times n$ matrix. The matrix $B = [b_{ij}]$ with $b_{ij} = C_{ji}$, for $1 \le i, j \le n$, is called the Adjoint of A, denoted Adj(A).


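Example 2.8.11 Let $A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \\ 1 & 2 & 2 \end{bmatrix}$. Then $\text{Adj}(A) = \begin{bmatrix} 4 & 2 & -7 \\ -3 & -1 & 5 \\ 1 & 0 & -1 \end{bmatrix}$;

as $C_{11} = (-1)^{1+1}A_{11} = 4$, $C_{12} = (-1)^{1+2}A_{12} = -3$, $C_{13} = (-1)^{1+3}A_{13} = 1$, and so on.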


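Theorem 2.8.12 Let A be an $n \times n$ matrix. Then

1. for $1 \le i \le n$, $\sum\limits_{j=1}^{n} a_{ij} C_{ij} = \sum\limits_{j=1}^{n} a_{ij}(-1)^{i+j} A_{ij} = \det(A)$,

2. for $i \neq \ell$, $\sum\limits_{j=1}^{n} a_{ij} C_{\ell j} = \sum\limits_{j=1}^{n} a_{ij}(-1)^{\ell+j} A_{\ell j} = 0$, and

3. $A(\text{Adj}(A)) = \det(A) I_n$. Thus,

\[ \det(A) \neq 0 \Longrightarrow A^{-1} = \frac{1}{\det(A)} \text{Adj}(A). \tag{2.8.2} \]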


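Proof. Part 1 is the expansion of the determinant along the $i$th row (Remark 2.8.8). For Part 2, let $B = [b_{ij}]$ be a square matrix with
• the $\ell$th row of B as the $i$th row of A,
• the other rows of B the same as those of A.

By the construction of B, two rows ($i$th and $\ell$th) are equal. By Part 5 of Theorem 2.8.7, det(B) = 0. By construction again, $\det(A(\ell|j)) = \det(B(\ell|j))$ for $1 \le j \le n$. Thus, by Remark 2.8.8, we have

\begin{align*}
0 = \det(B) &= \sum_{j=1}^{n} (-1)^{\ell+j} b_{\ell j} \det\big(B(\ell|j)\big) = \sum_{j=1}^{n} (-1)^{\ell+j} a_{ij} \det\big(B(\ell|j)\big) \\
&= \sum_{j=1}^{n} (-1)^{\ell+j} a_{ij} \det\big(A(\ell|j)\big) = \sum_{j=1}^{n} a_{ij} C_{\ell j}.
\end{align*}

Now,

\[
\big(A(\text{Adj}(A))\big)_{ij} = \sum_{k=1}^{n} a_{ik} \big(\text{Adj}(A)\big)_{kj} = \sum_{k=1}^{n} a_{ik} C_{jk} = \begin{cases} 0 & \text{if } i \neq j, \\ \det(A) & \text{if } i = j. \end{cases}
\]

Thus, $A(\text{Adj}(A)) = \det(A) I_n$. Since $\det(A) \neq 0$, $A \, \dfrac{1}{\det(A)} \text{Adj}(A) = I_n$. Therefore, A has a right inverse. Hence, by Theorem 2.7.8, A has an inverse and

\[ A^{-1} = \frac{1}{\det(A)} \text{Adj}(A). \]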
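Example 2.8.13 Let $A = \begin{bmatrix} 1 & -1 & 0 \\ 0 & 1 & 1 \\ 1 & 2 & 1 \end{bmatrix}$. Then

\[ \text{Adj}(A) = \begin{bmatrix} -1 & 1 & -1 \\ 1 & 1 & -1 \\ -1 & -3 & 1 \end{bmatrix} \]

and $\det(A) = -2$. By Theorem 2.8.12.3, $A^{-1} = \begin{bmatrix} 1/2 & -1/2 & 1/2 \\ -1/2 & -1/2 & 1/2 \\ 1/2 & 3/2 & -1/2 \end{bmatrix}$.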
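The adjoint and formula (2.8.2) can be sketched as follows. This is an illustrative implementation under stated assumptions, not the book's: the names `adjugate` and `inverse_via_adjugate` are hypothetical, indices are 0-based, and the determinant of the empty matrix is taken to be 1 so that the $1 \times 1$ case works.

```python
from fractions import Fraction


def submatrix(a, i, j):
    """A(i|j): delete row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]


def det(a):
    """Determinant by expansion along the first row."""
    if not a:
        return 1  # convention for the empty matrix (assumption of this sketch)
    return sum((-1) ** j * a[0][j] * det(submatrix(a, 0, j))
               for j in range(len(a)))


def adjugate(a):
    """Adj(A): the (i, j) entry is the cofactor C_{ji} (note the transpose)."""
    n = len(a)
    return [[(-1) ** (i + j) * det(submatrix(a, j, i)) for j in range(n)]
            for i in range(n)]


def inverse_via_adjugate(a):
    """A^{-1} = Adj(A) / det(A) when det(A) != 0, else None."""
    d = det(a)
    if d == 0:
        return None
    return [[Fraction(entry, d) for entry in row] for row in adjugate(a)]


A1 = [[1, 2, 3], [2, 3, 1], [1, 2, 2]]    # matrix of Example 2.8.11
A2 = [[1, -1, 0], [0, 1, 1], [1, 2, 1]]   # matrix of Example 2.8.13
inv2 = inverse_via_adjugate(A2)
```

This route needs $n^2$ determinants of order $n - 1$, so it is mainly of theoretical interest; the Gauss-Jordan method is the practical choice for larger matrices.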


The next corollary is an easy consequence of Theorem 2.8.12 (recall Theorem 2.7.8).


Corollary 2.8.14 If A is a non-singular matrix, then

\[
(\text{Adj}(A))A = \det(A) I_n \quad \text{and} \quad \sum_{i=1}^{n} a_{ij} C_{ik} = \begin{cases} \det(A) & \text{if } j = k, \\ 0 & \text{if } j \neq k. \end{cases}
\]


Theorem 2.8.15 Let A and B be square matrices of order n. Then det(AB) = det(A) det(B). 


Proof. Step 1. Let $\det(A) \neq 0$.

This means A is invertible. Therefore, either A is an elementary matrix or is a product of elementary matrices (see Theorem 2.7.7). So, let $E_1, E_2, \ldots, E_k$ be elementary matrices such that $A = E_1 E_2 \cdots E_k$. Then, by using Parts 1, 2 and 4 of Theorem 2.8.7 repeatedly, we get

\begin{align*}
\det(AB) &= \det(E_1 E_2 \cdots E_k B) = \det(E_1) \det(E_2 \cdots E_k B) \\
&= \det(E_1) \det(E_2) \det(E_3 \cdots E_k B) \\
&= \det(E_1 E_2) \det(E_3 \cdots E_k B) \\
&\;\;\vdots \\
&= \det(E_1 E_2 \cdots E_k) \det(B) \\
&= \det(A) \det(B).
\end{align*}

Thus, we get the required result in case A is non-singular.

Step 2. Suppose det(A) = 0.

Then A is not invertible. Hence, there exists an invertible matrix P such that PA = C, where $C = \begin{bmatrix} C_1 \\ 0 \end{bmatrix}$. So, $A = P^{-1}C$, and therefore

\begin{align*}
\det(AB) &= \det\big((P^{-1}C)B\big) = \det\big(P^{-1}(CB)\big) = \det\left(P^{-1} \begin{bmatrix} C_1 B \\ 0 \end{bmatrix}\right) \\
&= \det(P^{-1}) \cdot \det\left(\begin{bmatrix} C_1 B \\ 0 \end{bmatrix}\right) \quad \text{as } P^{-1} \text{ is non-singular} \\
&= \det(P^{-1}) \cdot 0 = 0 = 0 \cdot \det(B) = \det(A)\det(B).
\end{align*}

Thus, the proof of the theorem is complete.
Corollary 2.8.16 Let A be a square matrix. Then A is non-singular if and only if A has an inverse.

Proof. Suppose A is non-singular. Then $\det(A) \neq 0$ and therefore, $A^{-1} = \dfrac{1}{\det(A)}\text{Adj}(A)$. Thus, A has an inverse.

Suppose A has an inverse. Then there exists a matrix B such that AB = I = BA. Taking determinant of both sides, we get

\[ \det(A)\det(B) = \det(AB) = \det(I) = 1. \]

This implies that $\det(A) \neq 0$. Thus, A is non-singular.


Theorem 2.8.17 Let A be a square matrix. Then $\det(A) = \det(A^t)$.

Proof. If A is non-singular, Corollary 2.8.14 gives $\det(A) = \det(A^t)$.

If A is singular, then det(A) = 0. Hence, by Corollary 2.8.16, A doesn't have an inverse. Therefore, $A^t$ also doesn't have an inverse (for if $A^t$ has an inverse then $A^{-1} = \big((A^t)^{-1}\big)^t$). Thus again by Corollary 2.8.16, $\det(A^t) = 0$. Therefore, we again have $\det(A) = 0 = \det(A^t)$.

Hence, we have $\det(A) = \det(A^t)$.


2.8.2 Cramer’s Rule 


Recall the following:
• The linear system Ax = b has a unique solution for every b if and only if $A^{-1}$ exists.
• A has an inverse if and only if $\det(A) \neq 0$.

Thus, Ax = b has a unique solution FOR EVERY b if and only if $\det(A) \neq 0$.
The following theorem gives a direct method of finding the solution of the linear system Ax = b when $\det(A) \neq 0$.


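Theorem 2.8.18 (Cramer's Rule) Let Ax = b be a linear system with n equations in n unknowns. If $\det(A) \neq 0$, then the unique solution to this system is

\[ x_j = \frac{\det(A_j)}{\det(A)}, \quad \text{for } j = 1, 2, \ldots, n, \]

where $A_j$ is the matrix obtained from A by replacing the $j$th column of A by the column vector b.

Proof. Since $\det(A) \neq 0$, $A^{-1} = \dfrac{1}{\det(A)}\text{Adj}(A)$. Thus, the linear system Ax = b has the solution $x = \dfrac{1}{\det(A)}\text{Adj}(A)\,b$. Hence, $x_j$, the $j$th coordinate of x, is given by

\[ x_j = \frac{b_1 C_{1j} + b_2 C_{2j} + \cdots + b_n C_{nj}}{\det(A)} = \frac{\det(A_j)}{\det(A)}. \]

The theorem implies that

\[
x_1 = \frac{1}{\det(A)} \begin{vmatrix} b_1 & a_{12} & \cdots & a_{1n} \\ b_2 & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ b_n & a_{n2} & \cdots & a_{nn} \end{vmatrix},
\]

and in general

\[
x_j = \frac{1}{\det(A)} \begin{vmatrix} a_{11} & \cdots & a_{1\,j-1} & b_1 & a_{1\,j+1} & \cdots & a_{1n} \\ a_{21} & \cdots & a_{2\,j-1} & b_2 & a_{2\,j+1} & \cdots & a_{2n} \\ \vdots & & \vdots & \vdots & \vdots & & \vdots \\ a_{n1} & \cdots & a_{n\,j-1} & b_n & a_{n\,j+1} & \cdots & a_{nn} \end{vmatrix}
\]

for $j = 2, 3, \ldots, n$.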


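Example 2.8.19 Suppose that $A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \\ 1 & 2 & 2 \end{bmatrix}$ and $b = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$. Use Cramer's rule to find a vector x such that Ax = b.

Solution: Check that det(A) = 1. Therefore $x_1 = \begin{vmatrix} 1 & 2 & 3 \\ 1 & 3 & 1 \\ 1 & 2 & 2 \end{vmatrix} = -1$, $x_2 = \begin{vmatrix} 1 & 1 & 3 \\ 2 & 1 & 1 \\ 1 & 1 & 2 \end{vmatrix} = 1$, and $x_3 = \begin{vmatrix} 1 & 2 & 1 \\ 2 & 3 & 1 \\ 1 & 2 & 1 \end{vmatrix} = 0$. That is, $x^t = (-1, 1, 0)$.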
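Cramer's rule can be written out in a few lines. The sketch below is an illustration only: the function name `cramer` is hypothetical, the determinant is computed by first-row expansion (slow for large n), and exact fractions are used for the quotients.

```python
from fractions import Fraction


def det(a):
    """Determinant by expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j]
               * det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))


def cramer(a, b):
    """Solve Ax = b by Cramer's rule; requires det(A) != 0."""
    d = det(a)
    if d == 0:
        raise ValueError("Cramer's rule needs a non-singular matrix")
    solution = []
    for j in range(len(a)):
        # A_j: replace the j-th column of A by the column vector b.
        a_j = [row[:j] + [b[i]] + row[j + 1:] for i, row in enumerate(a)]
        solution.append(Fraction(det(a_j), d))
    return solution


# The system of Example 2.8.19.
x = cramer([[1, 2, 3], [2, 3, 1], [1, 2, 2]], [1, 1, 1])
```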
1 ..2 12 1 


2.9 Miscellaneous Exercises 


Exercise 2.9.1 1. Let A be an orthogonal matrix. Show that $\det A = \pm 1$.

2. If A and B are two $n \times n$ non-singular matrices, are the matrices $A + B$ and $A - B$ non-singular? Justify your answer.


3. For an $n \times n$ matrix A, prove that the following conditions are equivalent:

4. Let $A = \begin{bmatrix} 2 & 0 & 6 & 0 & 4 \\ 5 & 3 & 2 & 2 & 7 \\ 2 & 5 & 7 & 5 & 5 \\ 2 & 0 & 9 & 2 & 7 \\ 7 & 8 & 4 & 2 & 1 \end{bmatrix}$. We know that the numbers 20604, 53227, 25755, 20927 and 78421 are all divisible by 17. Does this imply 17 divides det(A)?
5. Let $A = [a_{ij}]_{n \times n}$ where $a_{ij} = x_i^{\,j-1}$. Show that $\det(A) = \prod\limits_{1 \le i < j \le n} (x_j - x_i)$. [The matrix A is usually called the Vandermonde matrix.]

6. Let $A = [a_{ij}]$ with $a_{ij} = \max\{i, j\}$ be an $n \times n$ matrix. Compute det A.

7. Let $A = [a_{ij}]$ with $a_{ij} = 1/(i + j)$ be an $n \times n$ matrix. Show that A is invertible.

8. Solve the following system of equations by Cramer's rule.
i) $x + y + z - w = 1$, $x + y - z + w = 2$, $2x + y + z - w = 7$, $x + y + z + w = 3$.
ii) $x - y + z - w = 1$, $x + y - z + w = 2$, $2x + y - z - w = 7$, $x - y - z + w = 3$.


9. Suppose $A = [a_{ij}]$ and $B = [b_{ij}]$ are two $n \times n$ matrices such that $b_{ij} = p^{\,i-j} a_{ij}$ for $1 \le i, j \le n$ for some non-zero real number p. Then compute det(B) in terms of det(A).


10. The position of an element $a_{ij}$ of a determinant is called even or odd according as $i + j$ is even or odd. Show that

(a) If all the entries in odd positions are multiplied with $-1$ then the value of the determinant doesn't change.

(b) If all entries in even positions are multiplied with $-1$ then the determinant

i. does not change if the matrix is of even order.

ii. is multiplied by $-1$ if the matrix is of odd order.

11. Let A be an $n \times n$ Hermitian matrix, that is, $A^* = A$. Show that det A is a real number. [A is a matrix with complex entries and $A^* = \overline{A^t}$.]

12. Let A be an $n \times n$ matrix. Then show that

A is invertible $\Longleftrightarrow$ Adj(A) is invertible.

13. Let A and B be invertible matrices. Prove that Adj(AB) = Adj(B)Adj(A).

14. Let $P = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$ be a rectangular matrix with A a square matrix of order n and $|A| \neq 0$. Then show that rank(P) = n if and only if $D = CA^{-1}B$.


Chapter 3 
Finite Dimensional Vector Spaces 


Consider the problem of finding the set of points of intersection of the two planes $2x + 3y + z + u = 0$ and $3x + y + 2z + u = 0$.
Let V be the set of points of intersection of the two planes. Then V has the following properties:


1. The point (0,0,0,0) is an element of V. 


2. For the points $(-1, 0, 1, 1)$ and $(-5, 1, 7, 0)$ which belong to V, the point $(-6, 1, 8, 1) = (-1, 0, 1, 1) + (-5, 1, 7, 0) \in V$.

3. Let $\alpha \in \mathbb{R}$. Then the point $\alpha(-1, 0, 1, 1) = (-\alpha, 0, \alpha, \alpha)$ also belongs to V.


Similarly, for an $m \times n$ real matrix A, consider the set V of solutions of the homogeneous linear system Ax = 0. This set satisfies the following properties:

1. If Ax = 0 and Ay = 0, then $x, y \in V$. Then $x + y \in V$ as $A(x + y) = Ax + Ay = 0 + 0 = 0$. Also, $x + y = y + x$.

2. It is clear that if $x, y, z \in V$ then $(x + y) + z = x + (y + z)$.

3. The vector $0 \in V$ as $A0 = 0$.

4. If Ax = 0 then $A(-x) = -Ax = 0$. Hence, $-x \in V$.

5. Let $\alpha \in \mathbb{R}$ and $x \in V$. Then $\alpha x \in V$ as $A(\alpha x) = \alpha Ax = 0$.

Thus we are led to the following.


3.1 Vector Spaces 


3.1.1 Definition 


Definition 3.1.1 (Vector Space) A vector space over F, denoted V(F), is a non-empty set, satisfying the following axioms:

1. VECTOR ADDITION: To every pair $u, v \in V$ there corresponds a unique element $u \oplus v$ in V such that

(a) $u \oplus v = v \oplus u$ (Commutative law).
(b) $(u \oplus v) \oplus w = u \oplus (v \oplus w)$ (Associative law).
(c) There is a unique element $\mathbf{0}$ in V (the zero vector) such that $u \oplus \mathbf{0} = u$, for every $u \in V$ (called the additive identity).
(d) For every $u \in V$ there is a unique element $-u \in V$ such that $u \oplus (-u) = \mathbf{0}$ (called the additive inverse).

$\oplus$ is called VECTOR ADDITION.

2. SCALAR MULTIPLICATION: For each $u \in V$ and $\alpha \in F$, there corresponds a unique element $\alpha \odot u$ in V such that

(a) $\alpha \odot (\beta \odot u) = (\alpha\beta) \odot u$ for every $\alpha, \beta \in F$ and $u \in V$.
(b) $1 \odot u = u$ for every $u \in V$, where $1 \in \mathbb{R}$.

3. DISTRIBUTIVE LAWS: RELATING VECTOR ADDITION WITH SCALAR MULTIPLICATION
For any $\alpha, \beta \in F$ and $u, v \in V$, the following distributive laws hold:

(a) $\alpha \odot (u \oplus v) = (\alpha \odot u) \oplus (\alpha \odot v)$.
(b) $(\alpha + \beta) \odot u = (\alpha \odot u) \oplus (\beta \odot u)$.

Note: the number 0 is the element of F whereas $\mathbf{0}$ is the zero vector.


Remark 3.1.2 The elements of F are called SCALARS, and that of V are called VECTORS. If F = R, the 
vector space is called a REAL VECTOR SPACE. If F = C, the vector space is called a COMPLEX VECTOR 
SPACE. 

We may sometimes write V for a vector space if F is understood from the context. 


An interesting consequence of Definition 3.1.1 is the following useful result. Intuitively, these results seem to be obvious, but for a better understanding of the axioms it is desirable to go through the proof.


Theorem 3.1.3 Let V be a vector space over F. Then
1. $u \oplus v = u$ implies $v = \mathbf{0}$.
2. $\alpha \odot u = \mathbf{0}$ if and only if either u is the zero vector or $\alpha = 0$.
3. $(-1) \odot u = -u$ for every $u \in V$.


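Proof. Proof of Part 1.
For $u \in V$, by Axiom 1d there exists $-u \in V$ such that $-u \oplus u = \mathbf{0}$.
Hence, $u \oplus v = u$ is equivalent to

\[ -u \oplus (u \oplus v) = -u \oplus u \iff (-u \oplus u) \oplus v = \mathbf{0} \iff \mathbf{0} \oplus v = \mathbf{0} \iff v = \mathbf{0}. \]

Proof of Part 2.
As $\mathbf{0} = \mathbf{0} \oplus \mathbf{0}$, using the distributive law, we have

\[ \alpha \odot \mathbf{0} = \alpha \odot (\mathbf{0} \oplus \mathbf{0}) = (\alpha \odot \mathbf{0}) \oplus (\alpha \odot \mathbf{0}). \]

Thus, for any $\alpha \in F$, the first part implies $\alpha \odot \mathbf{0} = \mathbf{0}$. In the same way,

\[ 0 \odot u = (0 + 0) \odot u = (0 \odot u) \oplus (0 \odot u). \]

Hence, using the first part, one has $0 \odot u = \mathbf{0}$ for any $u \in V$.
Now suppose $\alpha \odot u = \mathbf{0}$. If $\alpha = 0$ then the proof is over. Therefore, let us assume $\alpha \neq 0$ (note that $\alpha$ is a real or complex number, hence $\frac{1}{\alpha}$ exists) and

\[ \mathbf{0} = \frac{1}{\alpha} \odot \mathbf{0} = \frac{1}{\alpha} \odot (\alpha \odot u) = \Big(\frac{1}{\alpha}\,\alpha\Big) \odot u = 1 \odot u = u \]

as $1 \odot u = u$ for every vector $u \in V$.
Thus we have shown that if $\alpha \neq 0$ and $\alpha \odot u = \mathbf{0}$ then $u = \mathbf{0}$.

Proof of Part 3.
We have $\mathbf{0} = 0u = (1 + (-1))u = u + (-1)u$ and hence $(-1)u = -u$.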


3.1.2. Examples 


Example 3.1.4 1. The set $\mathbb{R}$ of real numbers, with the usual addition and multiplication (i.e., $\oplus \equiv +$ and $\odot \equiv \cdot$) forms a vector space over $\mathbb{R}$.

2. Consider the set $\mathbb{R}^2 = \{(x_1, x_2) : x_1, x_2 \in \mathbb{R}\}$. For $x_1, x_2, y_1, y_2 \in \mathbb{R}$ and $\alpha \in \mathbb{R}$, define

\[ (x_1, x_2) \oplus (y_1, y_2) = (x_1 + y_1, x_2 + y_2) \quad \text{and} \quad \alpha \odot (x_1, x_2) = (\alpha x_1, \alpha x_2). \]

Then $\mathbb{R}^2$ is a real vector space.

3. Let $\mathbb{R}^n = \{(a_1, a_2, \ldots, a_n) : a_i \in \mathbb{R}, 1 \le i \le n\}$ be the set of n-tuples of real numbers. For $u = (a_1, \ldots, a_n)$, $v = (b_1, \ldots, b_n)$ in V and $\alpha \in \mathbb{R}$, we define

\[ u \oplus v = (a_1 + b_1, \ldots, a_n + b_n) \quad \text{and} \quad \alpha \odot u = (\alpha a_1, \ldots, \alpha a_n) \]

(called component wise or coordinate wise operations). Then V is a real vector space with addition and scalar multiplication defined as above. This vector space is denoted by $\mathbb{R}^n$, called the real vector space of n-tuples.

4. Let $V = \mathbb{R}^+$ (the set of positive real numbers). This is NOT A VECTOR SPACE under usual operations of addition and scalar multiplication (why?). We now define a new vector addition and scalar multiplication as

\[ v_1 \oplus v_2 = v_1 \cdot v_2 \quad \text{and} \quad \alpha \odot v = v^{\alpha} \]

for all $v_1, v_2, v \in \mathbb{R}^+$ and $\alpha \in \mathbb{R}$. Then $\mathbb{R}^+$ is a real vector space with 1 as the additive identity.

5. Let $V = \mathbb{R}^2$. Define $(x_1, x_2) \oplus (y_1, y_2) = (x_1 + y_1 + 1, x_2 + y_2 - 3)$ and $\alpha \odot (x_1, x_2) = (\alpha x_1 + \alpha - 1, \alpha x_2 - 3\alpha + 3)$ for $(x_1, x_2), (y_1, y_2) \in \mathbb{R}^2$ and $\alpha \in \mathbb{R}$. Then it can be easily verified that the vector $(-1, 3)$ is the additive identity and V is indeed a real vector space.


Recall $\sqrt{-1}$ is denoted i.

6. Consider the set $\mathbb{C} = \{x + iy : x, y \in \mathbb{R}\}$ of complex numbers.

(a) For $x_1 + iy_1, x_2 + iy_2 \in \mathbb{C}$ and $\alpha \in \mathbb{R}$, define

\begin{align*}
(x_1 + iy_1) \oplus (x_2 + iy_2) &= (x_1 + x_2) + i(y_1 + y_2) \quad \text{and} \\
\alpha \odot (x_1 + iy_1) &= (\alpha x_1) + i(\alpha y_1).
\end{align*}

Then $\mathbb{C}$ is a real vector space.

(b) For $x_1 + iy_1, x_2 + iy_2 \in \mathbb{C}$ and $\alpha + i\beta \in \mathbb{C}$, define

\begin{align*}
(x_1 + iy_1) \oplus (x_2 + iy_2) &= (x_1 + x_2) + i(y_1 + y_2) \quad \text{and} \\
(\alpha + i\beta) \odot (x_1 + iy_1) &= (\alpha x_1 - \beta y_1) + i(\alpha y_1 + \beta x_1).
\end{align*}

Then $\mathbb{C}$ forms a complex vector space.

7. Consider the set $\mathbb{C}^n = \{(z_1, z_2, \ldots, z_n) : z_i \in \mathbb{C} \text{ for } 1 \le i \le n\}$. For $(z_1, \ldots, z_n), (w_1, \ldots, w_n) \in \mathbb{C}^n$ and $\alpha \in F$, define

\begin{align*}
(z_1, \ldots, z_n) \oplus (w_1, \ldots, w_n) &= (z_1 + w_1, \ldots, z_n + w_n) \quad \text{and} \\
\alpha \odot (z_1, \ldots, z_n) &= (\alpha z_1, \ldots, \alpha z_n).
\end{align*}

(a) If the set F is the set $\mathbb{C}$ of complex numbers, then $\mathbb{C}^n$ is a complex vector space having n-tuples of complex numbers as its vectors.

(b) If the set F is the set $\mathbb{R}$ of real numbers, then $\mathbb{C}^n$ is a real vector space having n-tuples of complex numbers as its vectors.

Remark 3.1.5 In Example 7a, the scalars are Complex numbers and hence $i(1, 0) = (i, 0)$. Whereas, in Example 7b, the scalars are Real Numbers and hence WE CANNOT WRITE $i(1, 0) = (i, 0)$.


8. Fix a positive integer n and let $M_n(\mathbb{R})$ denote the set of all $n \times n$ matrices with real entries. Then $M_n(\mathbb{R})$ is a real vector space with vector addition and scalar multiplication defined by

\[ A \oplus B = [a_{ij}] \oplus [b_{ij}] = [a_{ij} + b_{ij}], \qquad \alpha \odot A = \alpha \odot [a_{ij}] = [\alpha a_{ij}]. \]

9. Fix a positive integer n. Consider the set, $P_n(\mathbb{R})$, of all polynomials of degree $\le n$ with coefficients from $\mathbb{R}$ in the indeterminate x. Algebraically,

\[ P_n(\mathbb{R}) = \{a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n : a_i \in \mathbb{R}, 0 \le i \le n\}. \]

Let $f(x), g(x) \in P_n(\mathbb{R})$. Then $f(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$ and $g(x) = b_0 + b_1 x + b_2 x^2 + \cdots + b_n x^n$ for some $a_i, b_i \in \mathbb{R}$, $0 \le i \le n$. It can be verified that $P_n(\mathbb{R})$ is a real vector space with the addition and scalar multiplication defined by:

\begin{align*}
f(x) \oplus g(x) &= (a_0 + b_0) + (a_1 + b_1)x + \cdots + (a_n + b_n)x^n, \quad \text{and} \\
\alpha \odot f(x) &= \alpha a_0 + \alpha a_1 x + \cdots + \alpha a_n x^n \quad \text{for } \alpha \in \mathbb{R}.
\end{align*}

10. Consider the set $P(\mathbb{R})$ of all polynomials with real coefficients. Let $f(x), g(x) \in P(\mathbb{R})$. Observe that a polynomial of the form $a_0 + a_1 x + \cdots + a_m x^m$ can be written as $a_0 + a_1 x + \cdots + a_m x^m + 0 \cdot x^{m+1} + \cdots + 0 \cdot x^p$ for any $p > m$. Hence, we can assume $f(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_p x^p$ and $g(x) = b_0 + b_1 x + b_2 x^2 + \cdots + b_p x^p$ for some $a_i, b_i \in \mathbb{R}$, $0 \le i \le p$, for some large positive integer p. We now define the vector addition and scalar multiplication as

\begin{align*}
f(x) \oplus g(x) &= (a_0 + b_0) + (a_1 + b_1)x + \cdots + (a_p + b_p)x^p, \quad \text{and} \\
\alpha \odot f(x) &= \alpha a_0 + \alpha a_1 x + \cdots + \alpha a_p x^p \quad \text{for } \alpha \in \mathbb{R}.
\end{align*}

Then $P(\mathbb{R})$ forms a real vector space.

11. Let $C([-1, 1])$ be the set of all real valued continuous functions on the interval $[-1, 1]$. For $f, g \in C([-1, 1])$ and $\alpha \in \mathbb{R}$, define

\begin{align*}
(f \oplus g)(x) &= f(x) + g(x), \quad \text{and} \\
(\alpha \odot f)(x) &= \alpha f(x), \quad \text{for all } x \in [-1, 1].
\end{align*}

Then $C([-1, 1])$ forms a real vector space. The operations defined above are called POINT WISE ADDITION AND SCALAR MULTIPLICATION.

12. Let V and W be real vector spaces with binary operations $(+, \bullet)$ and $(\oplus, \odot)$, respectively. Consider the following operations on the set $V \times W$: for $(x_1, y_1), (x_2, y_2) \in V \times W$ and $\alpha \in \mathbb{R}$, define

\begin{align*}
(x_1, y_1) \oplus' (x_2, y_2) &= (x_1 + x_2, y_1 \oplus y_2), \quad \text{and} \\
\alpha \circ (x_1, y_1) &= (\alpha \bullet x_1, \alpha \odot y_1).
\end{align*}

On the right hand side, we write $x_1 + x_2$ to mean the addition in V, while $y_1 \oplus y_2$ is the addition in W. Similarly, $\alpha \bullet x_1$ and $\alpha \odot y_1$ come from scalar multiplication in V and W, respectively. With the above definitions, $V \times W$ also forms a real vector space.


The readers are advised to justify the statements made in the above examples. 
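For the less familiar operations of Example 3.1.4.5, a numerical spot-check of the axioms can build confidence before attempting a proof. The sketch below checks the axioms of Definition 3.1.1 on a few sample vectors and scalars only, so it is evidence and not a proof; the names `add`, `smul`, `neg`, `ZERO` and `axioms_hold` are assumptions of this illustration.

```python
from fractions import Fraction as F
from itertools import product


def add(x, y):
    """Vector addition of Example 3.1.4.5."""
    return (x[0] + y[0] + 1, x[1] + y[1] - 3)


def smul(a, x):
    """Scalar multiplication of Example 3.1.4.5."""
    return (a * x[0] + a - 1, a * x[1] - 3 * a + 3)


ZERO = (F(-1), F(3))  # the claimed additive identity


def neg(x):
    """Additive inverse, obtained by solving add(x, y) == ZERO for y."""
    return (-x[0] - 2, -x[1] + 6)


samples = [(F(0), F(0)), (F(2), F(-5)), (F(1, 3), F(7, 2))]
scalars = [F(0), F(1), F(-2), F(3, 4)]


def axioms_hold():
    """Check every axiom of Definition 3.1.1 on the sample data."""
    for u, v, w in product(samples, repeat=3):
        if add(u, v) != add(v, u):
            return False
        if add(add(u, v), w) != add(u, add(v, w)):
            return False
    for u in samples:
        if add(u, ZERO) != u or add(u, neg(u)) != ZERO or smul(F(1), u) != u:
            return False
    for a, b in product(scalars, repeat=2):
        for u, v in product(samples, repeat=2):
            if smul(a, smul(b, u)) != smul(a * b, u):
                return False
            if smul(a, add(u, v)) != add(smul(a, u), smul(a, v)):
                return False
            if smul(a + b, u) != add(smul(a, u), smul(b, u)):
                return False
    return True


ok = axioms_hold()
```

Consistently with Theorem 3.1.3, the additive inverse found here coincides with `smul(-1, x)`.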


From now on, we will use ‘$u + v$’ in place of ‘$u \oplus v$’ and ‘$\alpha \cdot u$ or $\alpha u$’ in place of ‘$\alpha \odot u$’.


3.1.3 Subspaces 


Definition 3.1.6 (Vector Subspace) Let S be a NON-EMPTY SUBSET of V. S(F) is said to be a subspace of V(F) if $\alpha u + \beta v \in S$ whenever $\alpha, \beta \in F$ and $u, v \in S$; where the vector addition and scalar multiplication are the same as that of V(F).


Remark 3.1.7 Any subspace is a vector space in its own right with respect to the vector addition and 
scalar multiplication that is defined for V(F). 


Example 3.1.8 1. Let V(F) be a vector space. Then
(a) $S = \{\mathbf{0}\}$, the set consisting of the zero vector $\mathbf{0}$,
(b) $S = V$
are vector subspaces of V. These are called trivial subspaces.

2. Let $S = \{(x, y, z) \in \mathbb{R}^3 : x + y - z = 0\}$. Then S is a subspace of $\mathbb{R}^3$. (S is a plane in $\mathbb{R}^3$ passing through the origin.)

3. Let $S = \{(x, y, z) \in \mathbb{R}^3 : x + y + z = 3\}$. Then S is not a subspace of $\mathbb{R}^3$. (S is again a plane in $\mathbb{R}^3$ but it doesn't pass through the origin.)


4. Let $S = \{(x, y, z) \in \mathbb{R}^3 : z = x\}$. Then S is a subspace of $\mathbb{R}^3$.


5. The vector space $P_n(\mathbb{R})$ is a subspace of the vector space $P(\mathbb{R})$.


Exercise 3.1.9 1. Which of the following are correct statements?

(a) Let $S = \{(x, y, z) \in \mathbb{R}^3 : z = x^2\}$. Then S is a subspace of $\mathbb{R}^3$.

(b) Let V(F) be a vector space. Let $x \in V$. Then the set $\{\alpha x : \alpha \in F\}$ forms a vector subspace of V.

(c) Let $W = \{f \in C([-1, 1]) : f(1/2) = 0\}$. Then W is a subspace of the real vector space $C([-1, 1])$.

2. Which of the following are subspaces of $\mathbb{R}^n(\mathbb{R})$?

(a) $\{(x_1, x_2, \ldots, x_n) : x_1 \ge 0\}$.
(b) $\{(x_1, x_2, \ldots, x_n) : x_1 + 2x_2 = 4x_3\}$.
(c) $\{(x_1, x_2, \ldots, x_n) : x_1 \text{ is rational}\}$.
(d) $\{(x_1, x_2, \ldots, x_n) : x_1 = x_3^2\}$.
(e) $\{(x_1, x_2, \ldots, x_n) : \text{either } x_1 \text{ or } x_2 \text{ or both is } 0\}$.
(f) $\{(x_1, x_2, \ldots, x_n) : |x_1| \le 1\}$.

3. Which of the following are subspaces of i) $\mathbb{C}^n(\mathbb{R})$ ii) $\mathbb{C}^n(\mathbb{C})$?

(a) $\{(z_1, z_2, \ldots, z_n) : z_1 \text{ is real}\}$.
(b) $\{(z_1, z_2, \ldots, z_n) : z_1 + z_2 = \overline{z_3}\}$.
(c) $\{(z_1, z_2, \ldots, z_n) : |z_1| = |z_2|\}$.


3.1.4 Linear Combinations 


Definition 3.1.10 (Linear Span) Let V(F) be a vector space and let $S = \{u_1, u_2, \ldots, u_n\}$ be a non-empty subset of V. The linear span of S is the set defined by

\[ L(S) = \{\alpha_1 u_1 + \alpha_2 u_2 + \cdots + \alpha_n u_n : \alpha_i \in F, 1 \le i \le n\}. \]

If S is an empty set we define $L(S) = \{\mathbf{0}\}$.


Example 3.1.11 1. Note that (4, 5, 5) is a linear combination of (1, 0, 0), (1, 1, 0), and (1, 1, 1) as $(4, 5, 5) = 5(1, 1, 1) - 1(1, 0, 0) + 0(1, 1, 0)$.
For each vector, the LINEAR COMBINATION IN TERMS OF THE VECTORS (1, 0, 0), (1, 1, 0), AND (1, 1, 1) IS UNIQUE.

2. Is (4, 5, 5) a linear combination of (1, 2, 3), (−1, 1, 4) and (3, 3, 2)?
Solution: We want to find $\alpha_1, \alpha_2, \alpha_3 \in \mathbb{R}$ such that

\[ \alpha_1(1, 2, 3) + \alpha_2(-1, 1, 4) + \alpha_3(3, 3, 2) = (4, 5, 5). \tag{3.1.1} \]

Check that $3(1, 2, 3) + (-1)(-1, 1, 4) + 0(3, 3, 2) = (4, 5, 5)$. Also, in this case, the vector (4, 5, 5) DOES NOT HAVE A UNIQUE EXPRESSION AS LINEAR COMBINATION OF VECTORS (1, 2, 3), (−1, 1, 4) AND (3, 3, 2).

3. Verify that (4, 5, 5) is not a linear combination of the vectors (1, 2, 1) and (1, 1, 0).

4. The linear span of $S = \{(1, 1, 1), (2, 1, 3)\}$ over $\mathbb{R}$ is

\begin{align*}
L(S) &= \{\alpha(1, 1, 1) + \beta(2, 1, 3) : \alpha, \beta \in \mathbb{R}\} \\
&= \{(\alpha + 2\beta, \alpha + \beta, \alpha + 3\beta) : \alpha, \beta \in \mathbb{R}\} \\
&= \{(x, y, z) \in \mathbb{R}^3 : 2x - y = z\},
\end{align*}

as $2(\alpha + 2\beta) - (\alpha + \beta) = \alpha + 3\beta$, and if $z = 2x - y$, take $\alpha = 2y - x$ and $\beta = x - y$.
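The computations in Example 3.1.11 can be replayed mechanically. The sketch below uses hypothetical helper names (`lin_comb`, `span_coefficients`); the third coefficient triple for item 2 is one extra choice made for this illustration, obtained by adding the dependency $-2(1,2,3) + (-1,1,4) + (3,3,2) = (0,0,0)$ to the triple given in the text.

```python
def lin_comb(coeffs, vectors):
    """Return sum_i coeffs[i] * vectors[i], computed coordinate-wise."""
    dim = len(vectors[0])
    return tuple(sum(c * v[k] for c, v in zip(coeffs, vectors))
                 for k in range(dim))


target = (4, 5, 5)

# Item 1: expression in terms of (1,1,1), (1,0,0), (1,1,0).
first = lin_comb((5, -1, 0), [(1, 1, 1), (1, 0, 0), (1, 1, 0)])

# Item 2: two different coefficient triples give the same vector,
# so the expression is not unique.
vs = [(1, 2, 3), (-1, 1, 4), (3, 3, 2)]
second = lin_comb((3, -1, 0), vs)
third = lin_comb((1, 0, 1), vs)


def span_coefficients(x, y, z):
    """Item 4: (alpha, beta) with (x,y,z) = alpha(1,1,1) + beta(2,1,3), or None."""
    if 2 * x - y != z:
        return None  # the point is not on the plane 2x - y = z
    return (2 * y - x, x - y)
```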


Lemma 3.1.12 (Linear Span is a subspace) Let V(F) be a vector space and let S be a non-empty subset of V. Then L(S) is a subspace of V(F).

Proof. By definition, $S \subset L(S)$ and hence L(S) is a non-empty subset of V. Let $u, v \in L(S)$. Then, for $1 \le i \le n$ there exist vectors $w_i \in S$ and scalars $\alpha_i, \beta_i \in F$ such that $u = \alpha_1 w_1 + \alpha_2 w_2 + \cdots + \alpha_n w_n$ and $v = \beta_1 w_1 + \beta_2 w_2 + \cdots + \beta_n w_n$. Hence,

\[ u + v = (\alpha_1 + \beta_1) w_1 + \cdots + (\alpha_n + \beta_n) w_n \in L(S). \]

The same computation with $\alpha u + \beta v$ in place of $u + v$ shows that $\alpha u + \beta v \in L(S)$ for all $\alpha, \beta \in F$. Thus, L(S) is a vector subspace of V(F).

Remark 3.1.13 Let V(F) be a vector space and $W \subset V$ be a subspace. If $S \subset W$, then $L(S) \subset W$ is a subspace of W as W is a vector space in its own right.

Theorem 3.1.14 Let S be a non-empty subset of a vector space V. Then L(S) is the smallest subspace of V containing S.

Proof. For every $u \in S$, $u = 1 \cdot u \in L(S)$ and therefore, $S \subset L(S)$. To show L(S) is the smallest subspace of V containing S, consider any subspace W of V containing S. Then by Remark 3.1.13, $L(S) \subset W$ and hence the result follows.


Definition 3.1.15 Let A be an $m \times n$ matrix with real entries. Then using the rows $a_1^t, a_2^t, \ldots, a_m^t \in \mathbb{R}^n$ and columns $b_1, b_2, \ldots, b_n \in \mathbb{R}^m$, we define
1. RowSpace(A) $= L(a_1, a_2, \ldots, a_m)$,
2. ColumnSpace(A) $= L(b_1, b_2, \ldots, b_n)$,
3. NullSpace(A), denoted $N(A)$, as $\{x^t \in \mathbb{R}^n : Ax = 0\}$,
4. Range(A), denoted $\text{Im}(A) = \{y : Ax = y \text{ for some } x^t \in \mathbb{R}^n\}$.

Note that the “column space” of a matrix A consists of all b such that Ax = b has a solution. Hence, ColumnSpace(A) = Range(A).


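Lemma 3.1.16 Let A be a real $m \times n$ matrix. Suppose B = EA for some elementary matrix E. Then RowSpace(A) = RowSpace(B).

Proof. We prove the result for the elementary matrix $E_{ij}(c)$, where $c \neq 0$ and $i < j$. Let $a_1^t, a_2^t, \ldots, a_m^t$ be the rows of the matrix A. Then $B = E_{ij}(c)A$ gives us

\begin{align*}
\text{RowSpace}(B) &= L(a_1, \ldots, a_{i-1}, a_i + c a_j, \ldots, a_m) \\
&= \{\alpha_1 a_1 + \cdots + \alpha_{i-1} a_{i-1} + \alpha_i (a_i + c a_j) + \cdots + \alpha_m a_m : \alpha_\ell \in \mathbb{R}, 1 \le \ell \le m\} \\
&= \Big\{\sum_{\ell=1}^{m} \alpha_\ell a_\ell + \alpha_i c\, a_j : \alpha_\ell \in \mathbb{R}, 1 \le \ell \le m\Big\} \\
&= \Big\{\sum_{\ell=1}^{m} \beta_\ell a_\ell : \beta_\ell \in \mathbb{R}, 1 \le \ell \le m\Big\} \\
&= L(a_1, \ldots, a_{i-1}, a_i, \ldots, a_m) \\
&= \text{RowSpace}(A).
\end{align*}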

Theorem 3.1.17 Let A be an $m \times n$ matrix with real entries. Then
1. $N(A)$ is a subspace of $\mathbb{R}^n$;

2. the non-zero row vectors of a matrix in row-reduced form form a basis for the row-space. Hence dim(RowSpace(A)) = row rank of (A).

Proof. Part 1) can be easily proved. Let A be an $m \times n$ matrix. For part 2), let D be the row-reduced form of A with non-zero rows $d_1^t, d_2^t, \ldots, d_r^t$. Then $D = E_k E_{k-1} \cdots E_2 E_1 A$ for some elementary matrices $E_1, E_2, \ldots, E_k$. Then, a repeated application of Lemma 3.1.16 implies RowSpace(A) = RowSpace(D). That is, if the rows of the matrix A are $a_1^t, a_2^t, \ldots, a_m^t$, then

\[ L(a_1, a_2, \ldots, a_m) = L(d_1, d_2, \ldots, d_r). \]

Hence the required result follows.


Exercise 3.1.18 1. Show that any two row-equivalent matrices have the same row space. Give examples to show that the column space of two row-equivalent matrices need not be same.

2. Find all the vector subspaces of $\mathbb{R}^2$.

3. Let P and Q be two subspaces of a vector space V. Show that $P \cap Q$ is a subspace of V. Also show that $P \cup Q$ need not be a subspace of V. When is $P \cup Q$ a subspace of V?

4. Let P and Q be two subspaces of a vector space V. Define $P + Q = \{u + v : u \in P, v \in Q\}$. Show that $P + Q$ is a subspace of V. Also show that $L(P \cup Q) = P + Q$.

5. Let $S = \{x_1, x_2, x_3, x_4\}$ where $x_1 = (1, 0, 0, 0)$, $x_2 = (1, 1, 0, 0)$, $x_3 = (1, 2, 0, 0)$, $x_4 = (1, 1, 1, 0)$. Determine all $x_i$ such that $L(S) = L(S \setminus \{x_i\})$.


6. Let $C([-1, 1])$ be the set of all continuous functions on the interval $[-1, 1]$ (cf. Example 3.1.4.11). Let
W, = {f €C((-1,1]): f(0.2) =0}, and 
W. = {f¢C([-1,1]): i'(Dexists }. 
Are $W_1$, $W_2$ subspaces of $C([-1, 1])$?

7. Let $V = \{(x, y) : x, y \in \mathbb{R}\}$ over $\mathbb{R}$. Define $(x, y) \oplus (x_1, y_1) = (x + x_1, 0)$ and $\alpha \odot (x, y) = (\alpha x, 0)$. Show that V is not a vector space over $\mathbb{R}$.

8. Recall that $M_n(\mathbb{R})$ is the real vector space of all $n \times n$ real matrices. Prove that the following subsets are subspaces of $M_n(\mathbb{R})$.

(a) $sl_n = \{A \in M_n(\mathbb{R}) : \text{trace}(A) = 0\}$
(b) $Sym_n = \{A \in M_n(\mathbb{R}) : A = A^t\}$
(c) $Skew_n = \{A \in M_n(\mathbb{R}) : A + A^t = 0\}$

9. Let $V = \mathbb{R}$. Define $x \oplus y = x - y$ and $\alpha \odot x = -\alpha x$. Which vector space axioms are not satisfied here?


In this section, we saw that a vector space has infinite number of vectors. Hence, one can start with 
any finite collection of vectors and obtain their span. It means that any vector space contains infinite 
number of other vector subspaces. Therefore, the following questions arise: 


1. What are the conditions under which, the linear span of two distinct sets the same? 


2. Is it possible to find/choose vectors so that the linear span of the chosen vectors is the whole vector 
space itself? 


3. Suppose we are able to choose certain vectors whose linear span is the whole space. Can we find 
the minimum number of such vectors? 


We try to answer these questions in the subsequent sections. 

3.2 Linear Independence


Definition 3.2.1 (Linear Independence and Dependence) Let $S = \{u_1, u_2, \ldots, u_m\}$ be any non-empty subset of $V$. If there exist scalars $\alpha_i$, $1 \le i \le m$, not all zero, such that

\[ \alpha_1 u_1 + \alpha_2 u_2 + \cdots + \alpha_m u_m = 0, \]

then the set $S$ is called a linearly dependent set. Otherwise, the set $S$ is called linearly independent.


Example 3.2.2 1. Let $S = \{(1,2,1), (2,1,4), (3,3,5)\}$. Then check that $1(1,2,1) + 1(2,1,4) + (-1)(3,3,5) = (0,0,0)$. Since $\alpha_1 = 1$, $\alpha_2 = 1$ and $\alpha_3 = -1$ is a solution of (3.2.1), the set $S$ is a linearly dependent subset of $\mathbb{R}^3$.

2. Let $S = \{(1,1,1), (1,1,0), (1,0,1)\}$. Suppose there exist $\alpha, \beta, \gamma \in \mathbb{R}$ such that $\alpha(1,1,1) + \beta(1,1,0) + \gamma(1,0,1) = (0,0,0)$. Then check that in this case we necessarily have $\alpha = \beta = \gamma = 0$, which shows that the set $S = \{(1,1,1), (1,1,0), (1,0,1)\}$ is a linearly independent subset of $\mathbb{R}^3$.


In other words, if $S = \{u_1, u_2, \ldots, u_m\}$ is a non-empty subset of a vector space $V$, then to check whether the set $S$ is linearly dependent or independent, one needs to consider the equation

\[ \alpha_1 u_1 + \alpha_2 u_2 + \cdots + \alpha_m u_m = 0. \tag{3.2.1} \]

In case $\alpha_1 = \alpha_2 = \cdots = \alpha_m = 0$ is THE ONLY SOLUTION of (3.2.1), the set $S$ is a linearly independent subset of $V$. Otherwise, the set $S$ is a linearly dependent subset of $V$.
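
The test based on equation (3.2.1) can be tried out on a computer. The following is only a minimal sketch, assuming the vectors lie in $\mathbb{R}^n$ and have rational entries; the helper names `rank` and `is_linearly_independent` are illustrative and not part of these notes. It uses the fact that (3.2.1) has only the trivial solution exactly when the matrix whose rows are the given vectors has rank equal to the number of vectors.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (given as a list of rows) via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0  # number of pivot rows found so far
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def is_linearly_independent(vectors):
    # (3.2.1) has only the trivial solution exactly when rank == number of vectors.
    return rank(vectors) == len(vectors)

print(is_linearly_independent([(1, 2, 1), (2, 1, 4), (3, 3, 5)]))  # False (Example 3.2.2, part 1)
print(is_linearly_independent([(1, 1, 1), (1, 1, 0), (1, 0, 1)]))  # True  (Example 3.2.2, part 2)
```

Exact rational arithmetic is used here so that a zero row is detected without rounding error.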


Proposition 3.2.3 Let $V$ be a vector space.
1. Then the zero vector cannot belong to a linearly independent set.
2. If $S$ is a linearly independent subset of $V$, then every subset of $S$ is also linearly independent.
3. If $S$ is a linearly dependent subset of $V$, then every set containing $S$ is also linearly dependent.

PROOF. We give the proof of the first part. The reader is required to supply the proof of the other parts.

Let $S = \{0 = u_1, u_2, \ldots, u_n\}$ be a set containing the zero vector. Then for any $\gamma \ne 0$, $\gamma u_1 + 0 u_2 + \cdots + 0 u_n = 0$. Hence, for the system $\alpha_1 u_1 + \alpha_2 u_2 + \cdots + \alpha_n u_n = 0$, we have a non-zero solution $\alpha_1 = \gamma$ and $0 = \alpha_2 = \cdots = \alpha_n$. Therefore, the set $S$ is linearly dependent.

Theorem 3.2.4 Let $\{v_1, v_2, \ldots, v_p\}$ be a linearly independent subset of a vector space $V$. Suppose there exists a vector $v_{p+1} \in V$ such that the set $\{v_1, v_2, \ldots, v_p, v_{p+1}\}$ is linearly dependent. Then $v_{p+1}$ is a linear combination of $v_1, v_2, \ldots, v_p$.

PROOF. Since the set $\{v_1, v_2, \ldots, v_p, v_{p+1}\}$ is linearly dependent, there exist scalars $\alpha_1, \alpha_2, \ldots, \alpha_{p+1}$, NOT ALL ZERO, such that

\[ \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_p v_p + \alpha_{p+1} v_{p+1} = 0. \tag{3.2.2} \]

CLAIM: $\alpha_{p+1} \ne 0$.

Let, if possible, $\alpha_{p+1} = 0$. Then equation (3.2.2) gives $\alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_p v_p = 0$ with not all $\alpha_i$, $1 \le i \le p$, zero. Hence, by the definition of linear independence, the set $\{v_1, v_2, \ldots, v_p\}$ is linearly dependent, which contradicts our hypothesis. Thus, $\alpha_{p+1} \ne 0$ and we get

\[ v_{p+1} = -\frac{1}{\alpha_{p+1}} (\alpha_1 v_1 + \cdots + \alpha_p v_p). \]

Note that $\alpha_i \in \mathbb{F}$ for every $i$, $1 \le i \le p+1$, and hence $-\frac{\alpha_i}{\alpha_{p+1}} \in \mathbb{F}$ for $1 \le i \le p$. Hence the result follows.


We now state two important corollaries of the above theorem. We omit their proofs, as they are easy consequences of the above theorem.

Corollary 3.2.5 Let $\{u_1, u_2, \ldots, u_n\}$ be a linearly dependent subset of a vector space $V$. Then there exists a smallest $k$, $2 \le k \le n$, such that

\[ L(u_1, u_2, \ldots, u_k) = L(u_1, u_2, \ldots, u_{k-1}). \]

The next corollary follows immediately from Theorem 3.2.4 and Corollary 3.2.5.

Corollary 3.2.6 Let $\{v_1, v_2, \ldots, v_p\}$ be a linearly independent subset of a vector space $V$. Suppose there exists a vector $v \in V$ such that $v \notin L(v_1, v_2, \ldots, v_p)$. Then the set $\{v_1, v_2, \ldots, v_p, v\}$ is also a linearly independent subset of $V$.


Exercise 3.2.7 1. Consider the vector space $\mathbb{R}^2$. Let $u_1 = (1,0)$. Find all choices for the vector $u_2$ such that the set $\{u_1, u_2\}$ is a linearly independent subset of $\mathbb{R}^2$. Do there exist choices for vectors $u_2$ and $u_3$ such that the set $\{u_1, u_2, u_3\}$ is a linearly independent subset of $\mathbb{R}^2$?

2. If none of the elements appearing along the principal diagonal of a lower triangular matrix is zero, show that the row vectors are linearly independent in $\mathbb{R}^n$. The same is true for column vectors.

3. Let $S = \{(1,1,1,1), (1,-1,1,2), (1,1,-1,1)\} \subset \mathbb{R}^4$. Determine whether or not the vector $(1,1,2,1) \in L(S)$.

4. Show that $S = \{(1,2,3), (-2,1,1), (8,6,10)\}$ is linearly dependent in $\mathbb{R}^3$.

5. Show that $S = \{(1,0,0), (1,1,0), (1,1,1)\}$ is a linearly independent set in $\mathbb{R}^3$. In general, if $\{f_1, f_2, f_3\}$ is a linearly independent set then $\{f_1, f_1 + f_2, f_1 + f_2 + f_3\}$ is also a linearly independent set.

6. In $\mathbb{R}^3$, give an example of 3 vectors $u$, $v$ and $w$ such that $\{u, v, w\}$ is linearly dependent but any set of 2 vectors from $u, v, w$ is linearly independent.

7. What is the maximum number of linearly independent vectors in $\mathbb{R}^3$?

8. Show that any set of $k$ vectors in $\mathbb{R}^3$ is linearly dependent if $k \ge 4$.

9. Is the set of vectors $(1,0)$, $(i,0)$ a linearly independent subset of $\mathbb{C}^2(\mathbb{R})$?

10. Under what conditions on $\alpha$ are the vectors $(1 + \alpha, 1 - \alpha)$ and $(\alpha - 1, 1 + \alpha)$ in $\mathbb{C}^2(\mathbb{R})$ linearly independent?

11. Let $u, v \in V$ and $M$ be a subspace of $V$. Further, let $K$ be the subspace spanned by $M$ and $u$, and $H$ be the subspace spanned by $M$ and $v$. Show that if $v \in K$ and $v \notin M$ then $u \in H$.


3.3 Bases


Definition 3.3.1 (Basis of a Vector Space) 1. A non-empty subset $\mathcal{B}$ of a vector space $V$ is called a basis of $V$ if

(a) $\mathcal{B}$ is a linearly independent set, and

(b) $L(\mathcal{B}) = V$, i.e., every vector in $V$ can be expressed as a linear combination of the elements of $\mathcal{B}$.

2. A vector in $\mathcal{B}$ is called a basis vector.


Remark 3.3.2 Let $\{v_1, v_2, \ldots, v_p\}$ be a basis of a vector space $V(\mathbb{F})$. Then any $v \in V$ is a UNIQUE LINEAR COMBINATION of the basis vectors $v_1, v_2, \ldots, v_p$.

Observe that if there exists a $v \in V$ such that $v = \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_p v_p$ and $v = \beta_1 v_1 + \beta_2 v_2 + \cdots + \beta_p v_p$, then

\[ 0 = v - v = (\alpha_1 - \beta_1) v_1 + (\alpha_2 - \beta_2) v_2 + \cdots + (\alpha_p - \beta_p) v_p. \]

But the set $\{v_1, v_2, \ldots, v_p\}$ is linearly independent, and therefore the scalars $\alpha_i - \beta_i$ for $1 \le i \le p$ must all be equal to zero. Hence, for $1 \le i \le p$, $\alpha_i = \beta_i$ and we have the uniqueness.

By convention, the linear span of an empty set is $\{0\}$. Hence, the empty set is a basis of the vector space $\{0\}$.


Example 3.3.3 1. Check that if $V = \{(x,y,0) : x, y \in \mathbb{R}\} \subset \mathbb{R}^3$, then $\mathcal{B} = \{(1,0,0), (0,1,0)\}$ or $\mathcal{B} = \{(1,0,0), (1,1,0)\}$ or $\mathcal{B} = \{(2,0,0), (1,3,0)\}$ or $\cdots$ are bases of $V$.

2. For $1 \le i \le n$, let $e_i = (0, \ldots, 0, 1, 0, \ldots, 0) \in \mathbb{R}^n$, with the entry 1 at the $i$th place. Then the set $\mathcal{B} = \{e_1, e_2, \ldots, e_n\}$ forms a basis of $\mathbb{R}^n$. This set is called the standard basis of $\mathbb{R}^n$.

That is, if $n = 3$, then the set $\{(1,0,0), (0,1,0), (0,0,1)\}$ forms the standard basis of $\mathbb{R}^3$.

3. Let $V = \{(x,y,z) : x + y - z = 0,\ x, y, z \in \mathbb{R}\}$ be a vector subspace of $\mathbb{R}^3$. Then $S = \{(1,1,2), (2,1,3), (1,2,3)\} \subset V$. It can be easily verified that the vector $(3,2,5) \in V$ and

\[ (3,2,5) = (1,1,2) + (2,1,3) = 4(1,1,2) - (1,2,3). \]

Then by Remark 3.3.2, $S$ cannot be a basis of $V$.

A basis of $V$ can be obtained by the following method:
The condition $x + y - z = 0$ is equivalent to $z = x + y$. We replace the value of $z$ with $x + y$ to get

\[ (x,y,z) = (x, y, x+y) = (x, 0, x) + (0, y, y) = x(1,0,1) + y(0,1,1). \]

Hence, $\{(1,0,1), (0,1,1)\}$ forms a basis of $V$.

4. Let $V = \{a + ib : a, b \in \mathbb{R}\}$ and $\mathbb{F} = \mathbb{C}$. That is, $V$ is a complex vector space. Note that any element $a + ib \in V$ can be written as $a + ib = (a + ib) \cdot 1$. Hence, a basis of $V$ is $\{1\}$.

5. Let $V = \{a + ib : a, b \in \mathbb{R}\}$ and $\mathbb{F} = \mathbb{R}$. That is, $V$ is a real vector space. Any element $a + ib \in V$ is expressible as $a \cdot 1 + b \cdot i$. Hence a basis of $V$ is $\{1, i\}$.

Observe that $i$ is a vector in $\mathbb{C}$. Also, $i \notin \mathbb{R}$ and hence $i \cdot (1 + 0 \cdot i)$ is not defined.

6. Recall the vector space $P(\mathbb{R})$, the vector space of all polynomials with real coefficients. A basis of this vector space is the set

\[ \{1, x, x^2, \ldots, x^n, \ldots\}. \]

This basis has an infinite number of vectors, as the degree of the polynomial can be any positive integer.

Definition 3.3.4 (Finite Dimensional Vector Space) A vector space $V$ is said to be finite dimensional if there exists a basis consisting of a finite number of elements. Otherwise, the vector space $V$ is called infinite dimensional.

In Example 3.3.3, the vector space of all polynomials is an example of an infinite dimensional vector space. All the other vector spaces are finite dimensional.

Remark 3.3.5 We can use the above results to obtain a basis of any finite dimensional vector space $V$ as follows:

Step 1: Choose a non-zero vector, say, $v_1 \in V$. Then the set $\{v_1\}$ is linearly independent.

Step 2: If $V = L(v_1)$, we have got a basis of $V$. Else there exists a vector, say, $v_2 \in V$ such that $v_2 \notin L(v_1)$. Then by Corollary 3.2.6, the set $\{v_1, v_2\}$ is linearly independent.

Step 3: If $V = L(v_1, v_2)$, then $\{v_1, v_2\}$ is a basis of $V$. Else there exists a vector, say, $v_3 \in V$ such that $v_3 \notin L(v_1, v_2)$. So, by Corollary 3.2.6, the set $\{v_1, v_2, v_3\}$ is linearly independent.

At the $i$th step, either $V = L(v_1, v_2, \ldots, v_i)$, or $L(v_1, v_2, \ldots, v_i) \ne V$.
In the first case, we have $\{v_1, v_2, \ldots, v_i\}$ as a basis of $V$.
In the second case, $L(v_1, v_2, \ldots, v_i) \subsetneq V$. So, we choose a vector, say, $v_{i+1} \in V$ such that $v_{i+1} \notin L(v_1, v_2, \ldots, v_i)$. Therefore, by Corollary 3.2.6, the set $\{v_1, v_2, \ldots, v_{i+1}\}$ is linearly independent.

This process will finally end as $V$ is a finite dimensional vector space.
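
The steps above can be imitated in code. The following is a rough sketch, under the assumption (not made in the remark itself) that we are handed a finite list of candidate vectors in $\mathbb{R}^n$ whose span is $V$; the names `rank` and `build_basis` are illustrative. A candidate is adjoined exactly when it lies outside the span of the vectors chosen so far, which is detected by an increase in rank.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def build_basis(candidates):
    """Adjoin each candidate that lies outside the span of the vectors chosen so far."""
    basis = []
    for v in candidates:
        # v is outside L(basis) exactly when adjoining it raises the rank (cf. Corollary 3.2.6).
        if rank(basis + [v]) > len(basis):
            basis.append(v)
    return basis

# Candidates taken from the plane x + y - z = 0 of Example 3.3.3.
print(build_basis([(1, 1, 2), (2, 1, 3), (1, 2, 3), (1, 0, 1)]))  # [(1, 1, 2), (2, 1, 3)]
```

The basis found depends on the order in which the candidates are listed; a different order may give a different, equally valid, basis.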


Exercise 3.3.6 1. Let $S = \{v_1, v_2, \ldots, v_p\}$ be a subset of a vector space $V(\mathbb{F})$. Suppose $L(S) = V$ but $S$ is not a linearly independent set. Then prove that each vector in $V$ can be expressed in more than one way as a linear combination of vectors from $S$.

2. Show that the set $\{(1,0,1), (1,i,0), (1,1,1-i)\}$ is a basis of $\mathbb{C}^3(\mathbb{C})$.

3. Let $A$ be a matrix of rank $r$. Then show that the $r$ non-zero rows in the row-reduced echelon form of $A$ are linearly independent and they form a basis of the row space of $A$.


3.3.1 Important Results 


Theorem 3.3.7 Let $\{v_1, v_2, \ldots, v_n\}$ be a basis of a given vector space $V$. If $\{w_1, w_2, \ldots, w_m\}$ is a set of vectors from $V$ with $m > n$, then this set is linearly dependent.

PROOF. Since we want to find whether the set $\{w_1, w_2, \ldots, w_m\}$ is linearly independent or not, we consider the linear system

\[ \alpha_1 w_1 + \alpha_2 w_2 + \cdots + \alpha_m w_m = 0 \tag{3.3.1} \]

with $\alpha_1, \alpha_2, \ldots, \alpha_m$ as the $m$ unknowns. If this linear system of equations has more than one solution, then the set will be linearly dependent.

As $\{v_1, v_2, \ldots, v_n\}$ is a basis of $V$ and $w_i \in V$, for each $i$, $1 \le i \le m$, there exist scalars $a_{ij}$, $1 \le i \le n$, $1 \le j \le m$, such that

\[
\begin{aligned}
w_1 &= a_{11} v_1 + a_{21} v_2 + \cdots + a_{n1} v_n \\
w_2 &= a_{12} v_1 + a_{22} v_2 + \cdots + a_{n2} v_n \\
&\ \ \vdots \\
w_m &= a_{1m} v_1 + a_{2m} v_2 + \cdots + a_{nm} v_n.
\end{aligned}
\]

The set of equations (3.3.1) can be rewritten as

\[ \alpha_1 \sum_{j=1}^{n} a_{j1} v_j + \alpha_2 \sum_{j=1}^{n} a_{j2} v_j + \cdots + \alpha_m \sum_{j=1}^{n} a_{jm} v_j = 0, \]

i.e.,

\[ \Big( \sum_{i=1}^{m} \alpha_i a_{1i} \Big) v_1 + \Big( \sum_{i=1}^{m} \alpha_i a_{2i} \Big) v_2 + \cdots + \Big( \sum_{i=1}^{m} \alpha_i a_{ni} \Big) v_n = 0. \]

Since the set $\{v_1, v_2, \ldots, v_n\}$ is linearly independent, we have

\[ \sum_{i=1}^{m} \alpha_i a_{1i} = \sum_{i=1}^{m} \alpha_i a_{2i} = \cdots = \sum_{i=1}^{m} \alpha_i a_{ni} = 0. \]

Therefore, finding $\alpha_i$'s satisfying equation (3.3.1) reduces to solving the system of homogeneous equations $A\alpha = 0$, where $\alpha^t = (\alpha_1, \alpha_2, \ldots, \alpha_m)$ and

\[ A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nm} \end{bmatrix}. \]

Since $n < m$, i.e., THE NUMBER OF EQUATIONS is strictly less than THE NUMBER OF UNKNOWNS, Corollary 2.6.3 implies that the solution set consists of an infinite number of elements. Therefore, the equation (3.3.1) has a solution with not all $\alpha_i$, $1 \le i \le m$, zero. Hence, the set $\{w_1, w_2, \ldots, w_m\}$ is a linearly dependent set.
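
The proof reduces the question to a homogeneous system $A\alpha = 0$ with more unknowns than equations. As an illustration only, the sketch below finds one non-trivial solution for vectors in $\mathbb{R}^n$ written in the standard basis, so that the columns of $A$ are the vectors $w_i$ themselves; the function name `nontrivial_relation` and the sample vectors are assumptions made for this example.

```python
from fractions import Fraction

def nontrivial_relation(vectors):
    """Return scalars, not all zero, with sum(alpha_i * w_i) == 0, or None if none exist."""
    m, n = len(vectors), len(vectors[0])
    # Coefficient matrix of the homogeneous system: column i holds the vector w_i.
    a = [[Fraction(vectors[i][k]) for i in range(m)] for k in range(n)]
    pivots = []  # pivot column of each pivot row, in order
    r = 0
    for c in range(m):
        p = next((i for i in range(r, n) if a[i][c] != 0), None)
        if p is None:
            continue
        a[r], a[p] = a[p], a[r]
        a[r] = [x / a[r][c] for x in a[r]]          # scale the pivot row
        for i in range(n):
            if i != r and a[i][c] != 0:
                f = a[i][c]
                a[i] = [x - f * y for x, y in zip(a[i], a[r])]
        pivots.append(c)
        r += 1
    free = [c for c in range(m) if c not in pivots]
    if not free:
        return None  # only the trivial solution exists
    alpha = [Fraction(0)] * m
    alpha[free[0]] = Fraction(1)                    # set one free unknown to 1
    for row, c in enumerate(pivots):
        alpha[c] = -a[row][free[0]]
    return alpha

# Three vectors in R^2 (m = 3 > n = 2) must satisfy a non-trivial relation.
w = [(1, 2), (3, 4), (5, 6)]
alpha = nontrivial_relation(w)
print(alpha)
```

For these three vectors the relation found is $w_1 - 2w_2 + w_3 = 0$.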


Remark 3.3.8 Let $V$ be a vector subspace of $\mathbb{R}^n$ with spanning set $S$. We give a method of finding a basis of $V$ from $S$.

1. Construct a matrix $A$ whose rows are the vectors in $S$.

2. Use only the elementary row operations $R_i(c)$ and $R_{ij}(c)$ to get the row-reduced form $B$ of $A$ (in fact we just need to make as many zero-rows as possible).

3. Let $\mathcal{B}$ be the set of vectors in $S$ corresponding to the non-zero rows of $B$.

Then the set $\mathcal{B}$ is a basis of $L(S) = V$.


Example 3.3.9 Let $S = \{(1,1,1,1), (1,1,-1,1), (1,1,0,1), (1,-1,1,1)\}$ be a subset of $\mathbb{R}^4$. Find a basis of $L(S)$.

Solution: Here $A = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & 1 \\ 1 & 1 & 0 & 1 \\ 1 & -1 & 1 & 1 \end{bmatrix}$. Applying row-reduction to $A$, we have

\[
\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & 1 \\ 1 & 1 & 0 & 1 \\ 1 & -1 & 1 & 1 \end{bmatrix}
\xrightarrow{R_{12}(-1),\ R_{13}(-1),\ R_{14}(-1)}
\begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 0 & -2 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & -2 & 0 & 0 \end{bmatrix}
\xrightarrow{R_{32}(-2)}
\begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & -2 & 0 & 0 \end{bmatrix}.
\]

Observe that the rows 1, 3 and 4 are non-zero. Hence, a basis of $L(S)$ consists of the first, third and fourth vectors of the set $S$. Thus, $\mathcal{B} = \{(1,1,1,1), (1,1,0,1), (1,-1,1,1)\}$ is a basis of $L(S)$.

Observe that at the last step, in place of the elementary row operation $R_{32}(-2)$, we can apply $R_{23}(-\frac{1}{2})$ to make the third row the zero-row. In this case, we get $\{(1,1,1,1), (1,1,-1,1), (1,-1,1,1)\}$ as a basis of $L(S)$.
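
The method of Remark 3.3.8 can be sketched in code as follows. This is one possible reading of the method, not the only one: each row is reduced using only the rows above it that were kept (operations of the type $R_{ij}(c)$, with no row interchange), so every row that stays non-zero still corresponds to the original vector in that position. The function name `basis_from_spanning_set` is an illustrative choice.

```python
from fractions import Fraction

def basis_from_spanning_set(S):
    """Indices of the vectors of S whose rows stay non-zero under elimination without row swaps."""
    reduced = []  # (pivot column, reduced row) for every row kept so far
    kept = []
    for idx, v in enumerate(S):
        row = [Fraction(x) for x in v]
        for col, piv in reduced:
            if row[col] != 0:
                f = row[col] / piv[col]
                row = [a - f * b for a, b in zip(row, piv)]  # add a multiple of an earlier row
        col = next((c for c, x in enumerate(row) if x != 0), None)
        if col is not None:  # the row did not become a zero-row
            reduced.append((col, row))
            kept.append(idx)
    return kept

S = [(1, 1, 1, 1), (1, 1, -1, 1), (1, 1, 0, 1), (1, -1, 1, 1)]
kept = basis_from_spanning_set(S)
print([S[i] for i in kept])  # [(1, 1, 1, 1), (1, 1, -1, 1), (1, -1, 1, 1)]
```

On the set $S$ of Example 3.3.9 this sketch makes the third row the zero-row, so it returns the second of the two bases mentioned in the example.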


Corollary 3.3.10 Let $V$ be a finite dimensional vector space. Then any two bases of $V$ have the same number of vectors.

PROOF. Let $\{u_1, u_2, \ldots, u_n\}$ and $\{v_1, v_2, \ldots, v_m\}$ be two bases of $V$ with $m > n$. Then by the above theorem the set $\{v_1, v_2, \ldots, v_m\}$ is linearly dependent if we take $\{u_1, u_2, \ldots, u_n\}$ as the basis of $V$. This contradicts the assumption that $\{v_1, v_2, \ldots, v_m\}$ is also a basis of $V$. Hence $m \le n$. Interchanging the roles of the two bases gives $n \le m$, and so $m = n$.

Definition 3.3.11 (Dimension of a Vector Space) The dimension of a finite dimensional vector space $V$ is the number of vectors in a basis of $V$, denoted $\dim(V)$.

Note that Corollary 3.2.6 can be used to generate a basis of ANY NON-TRIVIAL FINITE DIMENSIONAL VECTOR SPACE.


Example 3.3.12 1. Consider the complex vector space $\mathbb{C}^2(\mathbb{C})$. Then,

\[ (a + ib, c + id) = (a + ib)(1,0) + (c + id)(0,1). \]

So, $\{(1,0), (0,1)\}$ is a basis of $\mathbb{C}^2(\mathbb{C})$ and thus $\dim(V) = 2$.

2. Consider the real vector space $\mathbb{C}^2(\mathbb{R})$. In this case, any vector

\[ (a + ib, c + id) = a(1,0) + b(i,0) + c(0,1) + d(0,i). \]

Hence, the set $\{(1,0), (i,0), (0,1), (0,i)\}$ is a basis and $\dim(V) = 4$.

Remark 3.3.13 It is important to note that the dimension of a vector space may change if the underlying field (the set of scalars) is changed.


Example 3.3.14 Let $V$ be the set of all functions $f : \mathbb{R}^n \to \mathbb{R}$ with the property that $f(x + y) = f(x) + f(y)$ and $f(\alpha x) = \alpha f(x)$. For $f, g \in V$ and $t \in \mathbb{R}$, define

\[ (f \oplus g)(x) = f(x) + g(x) \quad \text{and} \quad (t \odot f)(x) = f(tx). \]

Then $V$ is a real vector space.
For $1 \le i \le n$, consider the functions

\[ e_i(x) = e_i\big((x_1, x_2, \ldots, x_n)\big) = x_i. \]

Then it can be easily verified that the set $\{e_1, e_2, \ldots, e_n\}$ is a basis of $V$ and hence $\dim(V) = n$.

The next theorem follows directly from Corollary 3.2.6 and Theorem 3.3.7. Hence, the proof is omitted.

Theorem 3.3.15 Let $S$ be a linearly independent subset of a finite dimensional vector space $V$. Then the set $S$ can be extended to form a basis of $V$.

Theorem 3.3.15 is equivalent to the following statement:
Let $V$ be a vector space of dimension $n$. Suppose we have found a linearly independent set $S = \{v_1, v_2, \ldots, v_r\} \subset V$. Then there exist vectors $v_{r+1}, \ldots, v_n$ in $V$ such that $\{v_1, v_2, \ldots, v_n\}$ is a basis of $V$.


Corollary 3.3.16 Let V be a vector space of dimension n. Then any set of n linearly independent vectors 


forms a basis of V. Also, every set of m vectors, m > n, is linearly dependent. 


Example 3.3.17 Let $V = \{(v,w,x,y,z) \in \mathbb{R}^5 : v + x - 3y + z = 0\}$ and $W = \{(v,w,x,y,z) \in \mathbb{R}^5 : w - x - z = 0,\ v = y\}$ be two subspaces of $\mathbb{R}^5$. Find bases of $V$ and $W$ containing a basis of $V \cap W$.

Solution: Let us find a basis of $V \cap W$. The solution set of the linear equations

\[ v + x - 3y + z = 0, \quad w - x - z = 0 \quad \text{and} \quad v = y \]

is given by

\[ (v, w, x, y, z)^t = (y, 2y, x, y, 2y - x)^t = y(1, 2, 0, 1, 2)^t + x(0, 0, 1, 0, -1)^t. \]

Thus, a basis of $V \cap W$ is

\[ \{(1, 2, 0, 1, 2),\ (0, 0, 1, 0, -1)\}. \]

To find a basis of $W$ containing a basis of $V \cap W$, we can proceed as follows:

1. Find a basis of $W$.

2. Take the basis of $V \cap W$ found above as the first two vectors and that of $W$ as the next set of vectors. Now use Remark 3.3.8 to get the required basis.

Heuristically, we can also find the basis in the following way:
A vector of $W$ has the form $(y, x + z, x, y, z)$ for $x, y, z \in \mathbb{R}$. Substituting $y = 1$, $x = 1$ and $z = 0$ in $(y, x + z, x, y, z)$ gives us the vector $(1, 1, 1, 1, 0) \in W$. It can be easily verified that a basis of $W$ is

\[ \{(1, 2, 0, 1, 2),\ (0, 0, 1, 0, -1),\ (1, 1, 1, 1, 0)\}. \]

Similarly, a vector of $V$ has the form $(v, w, x, y, 3y - v - x)$ for $v, w, x, y \in \mathbb{R}$. Substituting $v = 0$, $w = 1$, $x = 0$ and $y = 0$ gives a vector $(0, 1, 0, 0, 0) \in V$. Also, substituting $v = 0$, $w = 1$, $x = 1$ and $y = 1$ gives another vector $(0, 1, 1, 1, 2) \in V$. So, a basis of $V$ can be taken as

\[ \{(1, 2, 0, 1, 2),\ (0, 0, 1, 0, -1),\ (0, 1, 0, 0, 0),\ (0, 1, 1, 1, 2)\}. \]


Recall that for two vector subspaces $M$ and $N$ of a vector space $V(\mathbb{F})$, the vector subspace $M + N$ is defined by

\[ M + N = \{u + v : u \in M,\ v \in N\}. \]

With this definition, we have the following very important theorem (for a proof, see Appendix 14.4.1).

Theorem 3.3.18 Let $V(\mathbb{F})$ be a finite dimensional vector space and let $M$ and $N$ be two subspaces of $V$. Then

\[ \dim(M) + \dim(N) = \dim(M + N) + \dim(M \cap N). \tag{3.3.2} \]
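
Equation (3.3.2) can be checked numerically on the subspaces $V$ and $W$ of Example 3.3.17. The sketch below is only a sanity check under stated assumptions, not a proof: it takes the bases found in that example as given, computes $\dim(V + W)$ as the rank of the combined list of basis vectors, and uses an illustrative helper `rank`.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

# Bases taken from Example 3.3.17.
basis_V = [(1, 2, 0, 1, 2), (0, 0, 1, 0, -1), (0, 1, 0, 0, 0), (0, 1, 1, 1, 2)]
basis_W = [(1, 2, 0, 1, 2), (0, 0, 1, 0, -1), (1, 1, 1, 1, 0)]
basis_V_cap_W = [(1, 2, 0, 1, 2), (0, 0, 1, 0, -1)]

dim_V = rank(basis_V)
dim_W = rank(basis_W)
dim_sum = rank(basis_V + basis_W)   # V + W is spanned by the union of the two bases
dim_cap = rank(basis_V_cap_W)

print(dim_V, dim_W, dim_sum, dim_cap)  # 4 3 5 2
print(dim_V + dim_W == dim_sum + dim_cap)  # True
```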


Exercise 3.3.19 1. Find a basis of the vector space $P_n(\mathbb{R})$. Also, find $\dim(P_n(\mathbb{R}))$. What can you say about the dimension of $P(\mathbb{R})$?

2. Consider the real vector space $C([0, 2\pi])$ of all real valued continuous functions. For each $n$, consider the vector $e_n$ defined by $e_n(x) = \sin(nx)$. Prove that the collection of vectors $\{e_n : 1 \le n < \infty\}$ is a linearly independent set.

[Hint: On the contrary, assume that the set is linearly dependent. Then we have a finite set of vectors, say $\{e_{k_1}, e_{k_2}, \ldots, e_{k_\ell}\}$, that are linearly dependent. That is, there exist scalars $\alpha_i \in \mathbb{R}$ for $1 \le i \le \ell$, not all zero, such that

\[ \alpha_1 \sin(k_1 x) + \alpha_2 \sin(k_2 x) + \cdots + \alpha_\ell \sin(k_\ell x) = 0 \quad \text{for all } x \in [0, 2\pi]. \]

Now for different values of $m$ integrate the function

\[ \int_0^{2\pi} \sin(mx) \big( \alpha_1 \sin(k_1 x) + \alpha_2 \sin(k_2 x) + \cdots + \alpha_\ell \sin(k_\ell x) \big)\, dx \]

to get the required result.]

3. Show that the set $\{(1,0,0), (1,1,0), (1,1,1)\}$ is a basis of $\mathbb{C}^3(\mathbb{C})$. Is it a basis of $\mathbb{C}^3(\mathbb{R})$ also?

4. Let $W = \{(x,y,z,w) \in \mathbb{R}^4 : x + y - z + w = 0\}$ be a subspace of $\mathbb{R}^4$. Find its basis and dimension.

5. Let $V = \{(x,y,z,w) \in \mathbb{R}^4 : x + y - z + w = 0,\ x + y + z + w = 0\}$ and $W = \{(x,y,z,w) \in \mathbb{R}^4 : x - y - z + w = 0,\ x + 2y - w = 0\}$ be two subspaces of $\mathbb{R}^4$. Find bases and dimensions of $V$, $W$, $V \cap W$ and $V + W$.

6. Let $V$ be the set of all real symmetric $n \times n$ matrices. Find its basis and dimension. What if $V$ is the complex vector space of all $n \times n$ Hermitian matrices?

7. If $M$ and $N$ are 4-dimensional subspaces of a vector space $V$ of dimension 7, then show that $M$ and $N$ have at least one vector in common other than the zero vector.

8. Let $P = L\{(1,0,0), (1,1,0)\}$ and $Q = L\{(1,1,1)\}$ be vector subspaces of $\mathbb{R}^3$. Show that $P + Q = \mathbb{R}^3$ and $P \cap Q = \{0\}$. If $u \in \mathbb{R}^3$, determine $u_P$, $u_Q$ such that $u = u_P + u_Q$ where $u_P \in P$ and $u_Q \in Q$. Is it necessary that $u_P$ and $u_Q$ are unique?

9. Let $W_1$ be a $k$-dimensional subspace of an $n$-dimensional vector space $V(\mathbb{F})$ where $k \ge 1$. Prove that there exists an $(n - k)$-dimensional subspace $W_2$ of $V$ such that $W_1 \cap W_2 = \{0\}$ and $W_1 + W_2 = V$.

10. Let $P$ and $Q$ be subspaces of $\mathbb{R}^n$ such that $P + Q = \mathbb{R}^n$ and $P \cap Q = \{0\}$. Then show that each $u \in \mathbb{R}^n$ can be uniquely expressed as $u = u_P + u_Q$ where $u_P \in P$ and $u_Q \in Q$.

11. Let $P = L\{(1,-1,0), (1,1,0)\}$ and $Q = L\{(1,1,1), (1,2,1)\}$ be vector subspaces of $\mathbb{R}^3$. Show that $P + Q = \mathbb{R}^3$ and $P \cap Q \ne \{0\}$. Show that there exists a vector $u \in \mathbb{R}^3$ such that $u$ cannot be written uniquely in the form $u = u_P + u_Q$ where $u_P \in P$ and $u_Q \in Q$.

12. Recall the vector space $P_4(\mathbb{R})$. Is the set
$W = \{p(x) \in P_4(\mathbb{R}) : p(-1) = p(1) = 0\}$
a subspace of $P_4(\mathbb{R})$? If yes, find its dimension.

13. Let $V$ be the set of all $2 \times 2$ matrices with complex entries and $a_{11} + a_{22} = 0$. Show that $V$ is a real vector space. Find its basis. Also let $W = \{A \in V : a_{21} = -a_{12}\}$. Show $W$ is a vector subspace of $V$, and find its dimension.


[! ae ane) 2] | 2-4 0. 6 ] 
Let A= pe he Oe ,and B= ee be two matrices. For A and B find 
—2 4 0 -3 -5 1 -4 
4 2 5 6 10 -1 -1 1 2 


the following:

(a) their row-reduced echelon forms.
(b) the matrices $P_1$ and $P_2$ such that $P_1 A$ and $P_2 B$ are in row-reduced form.
(c) a basis each for the row spaces of $A$ and $B$.
(d) a basis each for the range spaces of $A$ and $B$.
(e) bases of the null spaces of $A$ and $B$.
(f) the dimensions of all the vector subspaces so obtained.

15. Let $M(n, \mathbb{R})$ denote the space of all $n \times n$ real matrices. For the sets given below, check that they are subspaces of $M(n, \mathbb{R})$ and also find their dimension.
(a) $sl(n, \mathbb{R}) = \{A \in M(n, \mathbb{R}) : \mathrm{tr}(A) = 0\}$, where recall that $\mathrm{tr}(A)$ stands for the trace of $A$.
(b) $S(n, \mathbb{R}) = \{A \in M(n, \mathbb{R}) : A = A^t\}$.
(c) $A(n, \mathbb{R}) = \{A \in M(n, \mathbb{R}) : A + A^t = 0\}$.


Before going to the next section, we prove that for any matrix A of order m x n 


Row rank(A) = Column rank(A). 
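
Proposition 3.3.20 Let $A$ be an $m \times n$ real matrix. Then

\[ \text{Row rank}(A) = \text{Column rank}(A). \]

PROOF. Let $R_1, R_2, \ldots, R_m$ be the rows of $A$ and $C_1, C_2, \ldots, C_n$ be the columns of $A$. Note that Row rank$(A) = r$ means that

\[ \dim\big(L(R_1, R_2, \ldots, R_m)\big) = r. \]

Hence, there exist vectors

\[ u_1 = (u_{11}, \ldots, u_{1n}),\ u_2 = (u_{21}, \ldots, u_{2n}),\ \ldots,\ u_r = (u_{r1}, \ldots, u_{rn}) \in \mathbb{R}^n \]

with

\[ R_i \in L(u_1, u_2, \ldots, u_r) \subset \mathbb{R}^n, \quad \text{for all } i,\ 1 \le i \le m. \]

Therefore, there exist real numbers $\alpha_{ij}$, $1 \le i \le m$, $1 \le j \le r$, such that

\[
\begin{aligned}
R_1 &= \alpha_{11} u_1 + \alpha_{12} u_2 + \cdots + \alpha_{1r} u_r = \Big( \sum_{i=1}^{r} \alpha_{1i} u_{i1},\ \sum_{i=1}^{r} \alpha_{1i} u_{i2},\ \ldots,\ \sum_{i=1}^{r} \alpha_{1i} u_{in} \Big), \\
R_2 &= \alpha_{21} u_1 + \alpha_{22} u_2 + \cdots + \alpha_{2r} u_r = \Big( \sum_{i=1}^{r} \alpha_{2i} u_{i1},\ \sum_{i=1}^{r} \alpha_{2i} u_{i2},\ \ldots,\ \sum_{i=1}^{r} \alpha_{2i} u_{in} \Big),
\end{aligned}
\]

and so on, till

\[ R_m = \alpha_{m1} u_1 + \cdots + \alpha_{mr} u_r = \Big( \sum_{i=1}^{r} \alpha_{mi} u_{i1},\ \sum_{i=1}^{r} \alpha_{mi} u_{i2},\ \ldots,\ \sum_{i=1}^{r} \alpha_{mi} u_{in} \Big). \]

So,

\[
C_1 = \begin{bmatrix} \sum_{i=1}^{r} \alpha_{1i} u_{i1} \\ \sum_{i=1}^{r} \alpha_{2i} u_{i1} \\ \vdots \\ \sum_{i=1}^{r} \alpha_{mi} u_{i1} \end{bmatrix}
= u_{11} \begin{bmatrix} \alpha_{11} \\ \alpha_{21} \\ \vdots \\ \alpha_{m1} \end{bmatrix}
+ u_{21} \begin{bmatrix} \alpha_{12} \\ \alpha_{22} \\ \vdots \\ \alpha_{m2} \end{bmatrix}
+ \cdots
+ u_{r1} \begin{bmatrix} \alpha_{1r} \\ \alpha_{2r} \\ \vdots \\ \alpha_{mr} \end{bmatrix}.
\]

In general, for $1 \le j \le n$, we have

\[
C_j = \begin{bmatrix} \sum_{i=1}^{r} \alpha_{1i} u_{ij} \\ \sum_{i=1}^{r} \alpha_{2i} u_{ij} \\ \vdots \\ \sum_{i=1}^{r} \alpha_{mi} u_{ij} \end{bmatrix}
= u_{1j} \begin{bmatrix} \alpha_{11} \\ \alpha_{21} \\ \vdots \\ \alpha_{m1} \end{bmatrix}
+ u_{2j} \begin{bmatrix} \alpha_{12} \\ \alpha_{22} \\ \vdots \\ \alpha_{m2} \end{bmatrix}
+ \cdots
+ u_{rj} \begin{bmatrix} \alpha_{1r} \\ \alpha_{2r} \\ \vdots \\ \alpha_{mr} \end{bmatrix}.
\]

Therefore, we observe that the columns $C_1, C_2, \ldots, C_n$ are linear combinations of the $r$ vectors

\[ (\alpha_{11}, \alpha_{21}, \ldots, \alpha_{m1})^t,\ (\alpha_{12}, \alpha_{22}, \ldots, \alpha_{m2})^t,\ \ldots,\ (\alpha_{1r}, \alpha_{2r}, \ldots, \alpha_{mr})^t. \]

Therefore,

\[ \text{Column rank}(A) = \dim\big(L(C_1, C_2, \ldots, C_n)\big) \le r = \text{Row rank}(A). \]

A similar argument gives

\[ \text{Row rank}(A) \le \text{Column rank}(A). \]

Thus, we have the required result.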
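
Proposition 3.3.20 can be observed on concrete matrices by computing the rank of a matrix and of its transpose. The sketch below is only an illustration on two sample matrices and assumes rational entries; `rank` and `transpose` are illustrative helper names.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def transpose(rows):
    # The rows of the transpose are the columns of the original matrix.
    return [list(col) for col in zip(*rows)]

A = [[1, 1, 1, 1], [1, 1, -1, 1], [1, 1, 0, 1], [1, -1, 1, 1]]  # matrix of Example 3.3.9
B = [[1, 2, 3], [2, 4, 6]]                                      # a 2 x 3 sample matrix

for M in (A, B):
    row_rank = rank(M)                # dimension of the span of the rows
    column_rank = rank(transpose(M))  # dimension of the span of the columns
    print(row_rank, column_rank)
```

For the two sample matrices the printed pairs are `3 3` and `1 1`.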

3.4 Ordered Bases


Let $\mathcal{B} = \{u_1, u_2, \ldots, u_n\}$ be a basis of a vector space $V(\mathbb{F})$. As $\mathcal{B}$ is a set, there is no ordering of its elements. In this section, we want to associate an order among the vectors in any basis of $V$.

Definition 3.4.1 (Ordered Basis) An ordered basis for a vector space $V(\mathbb{F})$ of dimension $n$ is a basis $\{u_1, u_2, \ldots, u_n\}$ together with a one-to-one correspondence between the sets $\{u_1, u_2, \ldots, u_n\}$ and $\{1, 2, 3, \ldots, n\}$.

If the ordered basis has $u_1$ as the first vector, $u_2$ as the second vector and so on, then we denote this ordered basis by

\[ (u_1, u_2, \ldots, u_n). \]


Example 3.4.2 Consider $P_2(\mathbb{R})$, the vector space of all polynomials of degree less than or equal to 2 with coefficients from $\mathbb{R}$. The set $\{1 - x, 1 + x, x^2\}$ is a basis of $P_2(\mathbb{R})$.
For any element $a_0 + a_1 x + a_2 x^2 \in P_2(\mathbb{R})$, we have

\[ a_0 + a_1 x + a_2 x^2 = \frac{a_0 - a_1}{2}(1 - x) + \frac{a_0 + a_1}{2}(1 + x) + a_2 x^2. \]

If $(1 - x, 1 + x, x^2)$ is an ordered basis, then $\frac{a_0 - a_1}{2}$ is the first component, $\frac{a_0 + a_1}{2}$ is the second component, and $a_2$ is the third component of the vector $a_0 + a_1 x + a_2 x^2$.

If we take $(1 + x, 1 - x, x^2)$ as an ordered basis, then $\frac{a_0 + a_1}{2}$ is the first component, $\frac{a_0 - a_1}{2}$ is the second component, and $a_2$ is the third component of the vector $a_0 + a_1 x + a_2 x^2$.

That is, as ordered bases, $(u_1, u_2, \ldots, u_n)$, $(u_2, u_3, \ldots, u_n, u_1)$ and $(u_n, u_1, u_2, \ldots, u_{n-1})$ are different even though they have the same set of vectors as elements.
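
Definition 3.4.3 (Coordinates of a Vector) Let $\mathcal{B} = (v_1, v_2, \ldots, v_n)$ be an ordered basis of a vector space $V(\mathbb{F})$ and let $v \in V$. If

\[ v = \beta_1 v_1 + \beta_2 v_2 + \cdots + \beta_n v_n \]

then the tuple $(\beta_1, \beta_2, \ldots, \beta_n)$ is called the coordinate of the vector $v$ with respect to the ordered basis $\mathcal{B}$.

Mathematically, we denote it by $[v]_{\mathcal{B}} = (\beta_1, \ldots, \beta_n)^t$, A COLUMN VECTOR.

Suppose $\mathcal{B}_1 = (u_1, u_2, \ldots, u_n)$ and $\mathcal{B}_2 = (u_n, u_1, u_2, \ldots, u_{n-1})$ are two ordered bases of $V$. Then for any $x \in V$ there exist unique scalars $\alpha_1, \alpha_2, \ldots, \alpha_n$ such that

\[ x = \alpha_1 u_1 + \alpha_2 u_2 + \cdots + \alpha_n u_n = \alpha_n u_n + \alpha_1 u_1 + \cdots + \alpha_{n-1} u_{n-1}. \]

Therefore,

\[ [x]_{\mathcal{B}_1} = (\alpha_1, \alpha_2, \ldots, \alpha_n)^t \quad \text{and} \quad [x]_{\mathcal{B}_2} = (\alpha_n, \alpha_1, \alpha_2, \ldots, \alpha_{n-1})^t. \]

Note that $x$ is uniquely written as $\sum_{i=1}^{n} \alpha_i u_i$, and hence the coordinates with respect to an ordered basis are unique.
Suppose that the ordered basis $\mathcal{B}_1$ is changed to the ordered basis $\mathcal{B}_3 = (u_2, u_1, u_3, \ldots, u_n)$. Then $[x]_{\mathcal{B}_3} = (\alpha_2, \alpha_1, \alpha_3, \ldots, \alpha_n)^t$. So, the coordinates of a vector depend on the ordered basis chosen.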


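Example 3.4.4 Let $V = \mathbb{R}^3$. Consider the ordered bases
$\mathcal{B}_1 = \big((1,0,0), (0,1,0), (0,0,1)\big)$, $\mathcal{B}_2 = \big((1,0,0), (1,1,0), (1,1,1)\big)$ and $\mathcal{B}_3 = \big((1,1,1), (1,1,0), (1,0,0)\big)$ of $V$. Then, with respect to the above bases we have

\[
\begin{aligned}
(1,-1,1) &= 1 \cdot (1,0,0) + (-1) \cdot (0,1,0) + 1 \cdot (0,0,1) \\
&= 2 \cdot (1,0,0) + (-2) \cdot (1,1,0) + 1 \cdot (1,1,1) \\
&= 1 \cdot (1,1,1) + (-2) \cdot (1,1,0) + 2 \cdot (1,0,0).
\end{aligned}
\]

Therefore, if we write $u = (1,-1,1)$, then

\[ [u]_{\mathcal{B}_1} = (1,-1,1)^t, \quad [u]_{\mathcal{B}_2} = (2,-2,1)^t, \quad [u]_{\mathcal{B}_3} = (1,-2,2)^t. \]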
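
Coordinates with respect to an ordered basis of $\mathbb{R}^n$ are obtained by solving a linear system whose coefficient columns are the basis vectors. The following is a minimal sketch of that computation using Gauss-Jordan elimination; the function name `coordinates` is an illustrative choice, and the code assumes that the given vectors really form a basis.

```python
from fractions import Fraction

def coordinates(basis, v):
    """Coordinates of v with respect to an ordered basis of R^n (Gauss-Jordan elimination)."""
    n = len(basis)
    # Augmented matrix [u_1 ... u_n | v]: the basis vectors are the columns.
    m = [[Fraction(basis[j][i]) for j in range(n)] + [Fraction(v[i])] for i in range(n)]
    for c in range(n):
        # A pivot exists in every column because the basis vectors are independent.
        p = next(i for i in range(c, n) if m[i][c] != 0)
        m[c], m[p] = m[p], m[c]
        m[c] = [x / m[c][c] for x in m[c]]
        for i in range(n):
            if i != c:
                f = m[i][c]
                m[i] = [x - f * y for x, y in zip(m[i], m[c])]
    return [row[n] for row in m]

u = (1, -1, 1)
B1 = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
B2 = [(1, 0, 0), (1, 1, 0), (1, 1, 1)]
B3 = [(1, 1, 1), (1, 1, 0), (1, 0, 0)]
for B in (B1, B2, B3):
    print([int(c) for c in coordinates(B, u)])
```

The three printed lists are `[1, -1, 1]`, `[2, -2, 1]` and `[1, -2, 2]`, in agreement with Example 3.4.4. Note that $\mathcal{B}_2$ and $\mathcal{B}_3$ contain the same vectors in a different order, and the coordinates are permuted accordingly.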


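In general, let $V$ be an $n$-dimensional vector space with ordered bases $\mathcal{B}_1 = (u_1, u_2, \ldots, u_n)$ and $\mathcal{B}_2 = (v_1, v_2, \ldots, v_n)$. Since $\mathcal{B}_1$ is a basis of $V$, there exist unique scalars $a_{ij}$, $1 \le i, j \le n$, such that

\[ v_i = \sum_{l=1}^{n} a_{li} u_l \quad \text{for } 1 \le i \le n. \]

That is, for each $i$, $1 \le i \le n$, $[v_i]_{\mathcal{B}_1} = (a_{1i}, a_{2i}, \ldots, a_{ni})^t$.
Let $v \in V$ with $[v]_{\mathcal{B}_2} = (\alpha_1, \alpha_2, \ldots, \alpha_n)^t$. As $\mathcal{B}_2$ is the ordered basis $(v_1, v_2, \ldots, v_n)$, we have

\[ v = \sum_{i=1}^{n} \alpha_i v_i = \sum_{i=1}^{n} \alpha_i \Big( \sum_{j=1}^{n} a_{ji} u_j \Big) = \sum_{j=1}^{n} \Big( \sum_{i=1}^{n} a_{ji} \alpha_i \Big) u_j. \]

Since $\mathcal{B}_1$ is a basis, this representation of $v$ in terms of the $u_j$'s is unique. So,

\[
[v]_{\mathcal{B}_1} = \Big( \sum_{i=1}^{n} a_{1i} \alpha_i,\ \sum_{i=1}^{n} a_{2i} \alpha_i,\ \ldots,\ \sum_{i=1}^{n} a_{ni} \alpha_i \Big)^t
= \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ a_{21} & \cdots & a_{2n} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nn} \end{bmatrix}
\begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n \end{bmatrix}
= A [v]_{\mathcal{B}_2}.
\]

Note that the $i$th column of the matrix $A$ is equal to $[v_i]_{\mathcal{B}_1}$, i.e., the $i$th column of $A$ is the coordinate of the $i$th vector $v_i$ of $\mathcal{B}_2$ with respect to the ordered basis $\mathcal{B}_1$. Hence, we have proved the following theorem.

Theorem 3.4.5 Let $V$ be an $n$-dimensional vector space with ordered bases $\mathcal{B}_1 = (u_1, u_2, \ldots, u_n)$ and $\mathcal{B}_2 = (v_1, v_2, \ldots, v_n)$. Let

\[ A = \big[ [v_1]_{\mathcal{B}_1}, [v_2]_{\mathcal{B}_1}, \ldots, [v_n]_{\mathcal{B}_1} \big]. \]

Then for any $v \in V$,

\[ [v]_{\mathcal{B}_1} = A [v]_{\mathcal{B}_2}. \]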


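Example 3.4.6 Consider two bases $\mathcal{B}_1 = \big((1,0,0), (1,1,0), (1,1,1)\big)$ and $\mathcal{B}_2 = \big((1,1,1), (1,-1,1), (1,1,0)\big)$ of $\mathbb{R}^3$.

1. Then

\[
\begin{aligned}
[(x,y,z)]_{\mathcal{B}_1} &= (x - y) \cdot (1,0,0) + (y - z) \cdot (1,1,0) + z \cdot (1,1,1) \\
&= (x - y,\ y - z,\ z)^t
\end{aligned}
\]

and

\[
\begin{aligned}
[(x,y,z)]_{\mathcal{B}_2} &= \Big( \frac{y - x}{2} + z \Big) \cdot (1,1,1) + \frac{x - y}{2} \cdot (1,-1,1) + (x - z) \cdot (1,1,0) \\
&= \Big( \frac{y - x}{2} + z,\ \frac{x - y}{2},\ x - z \Big)^t.
\end{aligned}
\]

2. Let $A = [a_{ij}] = \begin{bmatrix} 0 & 2 & 0 \\ 0 & -2 & 1 \\ 1 & 1 & 0 \end{bmatrix}$. The columns of the matrix $A$ are obtained by the following rule:

\[
\begin{aligned}
[(1,1,1)]_{\mathcal{B}_1} &= 0 \cdot (1,0,0) + 0 \cdot (1,1,0) + 1 \cdot (1,1,1) = (0,0,1)^t, \\
[(1,-1,1)]_{\mathcal{B}_1} &= 2 \cdot (1,0,0) + (-2) \cdot (1,1,0) + 1 \cdot (1,1,1) = (2,-2,1)^t
\end{aligned}
\]

and

\[ [(1,1,0)]_{\mathcal{B}_1} = 0 \cdot (1,0,0) + 1 \cdot (1,1,0) + 0 \cdot (1,1,1) = (0,1,0)^t. \]

That is, the elements of $\mathcal{B}_2 = \big((1,1,1), (1,-1,1), (1,1,0)\big)$ are expressed in terms of the ordered basis $\mathcal{B}_1$.

3. Note that for any $(x,y,z) \in \mathbb{R}^3$,

\[
[(x,y,z)]_{\mathcal{B}_1} = \begin{bmatrix} x - y \\ y - z \\ z \end{bmatrix}
= \begin{bmatrix} 0 & 2 & 0 \\ 0 & -2 & 1 \\ 1 & 1 & 0 \end{bmatrix}
\begin{bmatrix} \frac{y - x}{2} + z \\ \frac{x - y}{2} \\ x - z \end{bmatrix}
= A\, [(x,y,z)]_{\mathcal{B}_2}.
\]

4. The matrix $A$ is invertible and hence $[(x,y,z)]_{\mathcal{B}_2} = A^{-1} [(x,y,z)]_{\mathcal{B}_1}$.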
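
The identity $[v]_{\mathcal{B}_1} = A [v]_{\mathcal{B}_2}$ of this example can be spot-checked numerically. The sketch below simply hard-codes the matrix $A$ and the two coordinate formulas worked out for the bases $\big((1,0,0), (1,1,0), (1,1,1)\big)$ and $\big((1,1,1), (1,-1,1), (1,1,0)\big)$, and compares both sides on a few sample vectors; the helper names and the sample vectors are illustrative choices.

```python
from fractions import Fraction

# Change of basis matrix: its columns are the B1-coordinates of the vectors of B2.
A = [[0, 2, 0],
     [0, -2, 1],
     [1, 1, 0]]

def coords_B1(x, y, z):
    return [x - y, y - z, z]

def coords_B2(x, y, z):
    half = Fraction(1, 2)
    return [half * (y - x) + z, half * (x - y), x - z]

def matvec(M, v):
    # Ordinary matrix-vector product.
    return [sum(a * b for a, b in zip(row, v)) for row in M]

samples = [(1, -1, 1), (3, 2, 5), (0, 7, -4)]
checks = [matvec(A, coords_B2(*v)) == coords_B1(*v) for v in samples]
print(checks)  # [True, True, True]
```

Agreement on sample vectors does not prove the identity, but a wrong sign or a misplaced column of $A$ would show up at once.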


In the next chapter, we try to understand Theorem 3.4.5 again using the ideas of 'linear transformations / functions'.


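Exercise 3.4.7 1. Determine the coordinates of the vectors $(1,2,1)$ and $(4,-2,2)$ with respect to the basis $\mathcal{B} = \big((2,1,0), (2,1,1), (2,2,1)\big)$ of $\mathbb{R}^3$.

2. Consider the vector space $P_3(\mathbb{R})$.
(a) Show that $\mathcal{B}_1 = (1 - x,\ 1 + x^2,\ 1 - x^3,\ 3 + x^2 - x^3)$ and $\mathcal{B}_2 = (1,\ 1 - x,\ 1 + x^2,\ 1 - x^3)$ are bases of $P_3(\mathbb{R})$.
(b) Find the coordinates of the vector $u = 1 + x + x^2 + x^3$ with respect to the ordered bases $\mathcal{B}_1$ and $\mathcal{B}_2$.
(c) Find the matrix $A$ such that $[u]_{\mathcal{B}_1} = A [u]_{\mathcal{B}_2}$.
(d) Let $v = a_0 + a_1 x + a_2 x^2 + a_3 x^3$. Then verify the following:

\[
[v]_{\mathcal{B}_1} = \begin{bmatrix} -a_1 \\ -a_0 - a_1 + 2a_2 - a_3 \\ -a_0 - a_1 + a_2 - 2a_3 \\ a_0 + a_1 - a_2 + a_3 \end{bmatrix}
= \begin{bmatrix} 0 & 1 & 0 & 0 \\ -1 & 0 & 1 & 0 \\ -1 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \end{bmatrix}
\begin{bmatrix} a_0 + a_1 - a_2 + a_3 \\ -a_1 \\ a_2 \\ -a_3 \end{bmatrix}
= \begin{bmatrix} 0 & 1 & 0 & 0 \\ -1 & 0 & 1 & 0 \\ -1 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \end{bmatrix} [v]_{\mathcal{B}_2}.
\]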


Chapter 4 


Linear Transformations 


4.1 Definitions and Basic Properties 
Throughout this chapter, the scalar field F is either always the set R or always the set C. 


Definition 4.1.1 (Linear Transformation) Let $V$ and $W$ be vector spaces over $\mathbb{F}$. A map $T : V \to W$ is called a linear transformation if

\[ T(\alpha u + \beta v) = \alpha T(u) + \beta T(v), \quad \text{for all } \alpha, \beta \in \mathbb{F} \text{ and } u, v \in V. \]

We now give a few examples of linear transformations.


Example 4.1.2 1. Define $T : \mathbb{R} \to \mathbb{R}^2$ by $T(x) = (x, 3x)$ for all $x \in \mathbb{R}$. Then $T$ is a linear transformation as

\[ T(x + y) = (x + y, 3(x + y)) = (x, 3x) + (y, 3y) = T(x) + T(y), \]

and similarly $T(\alpha x) = (\alpha x, 3\alpha x) = \alpha T(x)$.

2. Verify that the maps given below from $\mathbb{R}^n$ to $\mathbb{R}$ are linear transformations. Let $x = (x_1, x_2, \ldots, x_n)$.
(a) Define $T(x) = \sum_{i=1}^{n} x_i$.
(b) For any $i$, $1 \le i \le n$, define $T_i(x) = x_i$.
(c) For a fixed vector $a = (a_1, a_2, \ldots, a_n) \in \mathbb{R}^n$, define $T(x) = \sum_{i=1}^{n} a_i x_i$. Note that examples (a) and (b) can be obtained by assigning particular values for the vector $a$.

3. Define $T : \mathbb{R}^2 \to \mathbb{R}^3$ by $T((x,y)) = (x + y,\ 2x - y,\ x + 3y)$.
Then $T$ is a linear transformation with $T((1,0)) = (1,2,1)$ and $T((0,1)) = (1,-1,3)$.

4. Let $A$ be an $m \times n$ real matrix. Define a map $T_A : \mathbb{R}^n \to \mathbb{R}^m$ by

\[ T_A(x) = Ax \quad \text{for every } x^t = (x_1, x_2, \ldots, x_n) \in \mathbb{R}^n. \]

Then $T_A$ is a linear transformation. That is, every $m \times n$ real matrix defines a linear transformation from $\mathbb{R}^n$ to $\mathbb{R}^m$.

5. Recall that $P_n(\mathbb{R})$ is the set of all polynomials of degree less than or equal to $n$ with real coefficients. Define $T : \mathbb{R}^{n+1} \to P_n(\mathbb{R})$ by

\[ T((a_1, a_2, \ldots, a_{n+1})) = a_1 + a_2 x + \cdots + a_{n+1} x^n \]

for $(a_1, a_2, \ldots, a_{n+1}) \in \mathbb{R}^{n+1}$. Then $T$ is a linear transformation.
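
The linearity condition of Definition 4.1.1 can be spot-checked on the map $T((x,y)) = (x + y,\ 2x - y,\ x + 3y)$ of part 3, together with its description as a matrix map $T_A$ as in part 4. The sketch below is illustrative only: the matrix `A` is assembled here from the images of $(1,0)$ and $(0,1)$, the helper names are not from the notes, and a check on sample inputs is evidence rather than a proof.

```python
def T(v):
    x, y = v
    return (x + y, 2 * x - y, x + 3 * y)

# Matrix whose columns are T((1, 0)) and T((0, 1)).
A = [[1, 1],
     [2, -1],
     [1, 3]]

def matvec(M, v):
    return tuple(sum(a * b for a, b in zip(row, v)) for row in M)

def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def scale(c, v):
    return tuple(c * a for a in v)

def respects_linearity(u, v, a, b):
    # Compare T(a*u + b*v) with a*T(u) + b*T(v).
    return T(add(scale(a, u), scale(b, v))) == add(scale(a, T(u)), scale(b, T(v)))

print(T((1, 0)), T((0, 1)))                         # (1, 2, 1) (1, -1, 3)
print(respects_linearity((1, 2), (3, -4), 5, -2))   # True
print(matvec(A, (3, 1)) == T((3, 1)))               # True
```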

Proposition 4.1.3 Let $T : V \to W$ be a linear transformation. Suppose that $0_V$ is the zero vector in $V$ and $0_W$ is the zero vector of $W$. Then $T(0_V) = 0_W$.

PROOF. Since $0_V = 0_V + 0_V$, we have

\[ T(0_V) = T(0_V + 0_V) = T(0_V) + T(0_V). \]

So, $T(0_V) = 0_W$ as $T(0_V) \in W$.


From now on, we write 0 for both the zero vector of the domain space and the zero vector of the 


range space. 


Definition 4.1.4 (Zero Transformation) Let $V$ and $W$ be vector spaces and let $T : V \to W$ be the map defined by

\[ T(v) = 0 \quad \text{for every } v \in V. \]

Then $T$ is a linear transformation. Such a linear transformation is called the zero transformation and is denoted by $0$.

Definition 4.1.5 (Identity Transformation) Let $V$ be a vector space and let $T : V \to V$ be the map defined by

\[ T(v) = v \quad \text{for every } v \in V. \]

Then $T$ is a linear transformation. Such a linear transformation is called the identity transformation and is denoted by $I$.


We now prove a result that relates a linear transformation $T$ with its value on a basis of the domain space.

Theorem 4.1.6 Let $T : V \to W$ be a linear transformation and $\mathcal{B} = (u_1, u_2, \ldots, u_n)$ be an ordered basis of $V$. Then the linear transformation $T$ is a linear combination of the vectors $T(u_1), T(u_2), \ldots, T(u_n)$.

In other words, $T$ is determined by $T(u_1), T(u_2), \ldots, T(u_n)$.

PROOF. Since $\mathcal{B}$ is a basis of $V$, for any $x \in V$ there exist scalars $\alpha_1, \alpha_2, \ldots, \alpha_n$ such that $x = \alpha_1 u_1 + \alpha_2 u_2 + \cdots + \alpha_n u_n$. So, by the definition of a linear transformation

\[ T(x) = T(\alpha_1 u_1 + \cdots + \alpha_n u_n) = \alpha_1 T(u_1) + \cdots + \alpha_n T(u_n). \]

Observe that, given $x \in V$, we know the scalars $\alpha_1, \alpha_2, \ldots, \alpha_n$. Therefore, to know $T(x)$, we just need to know the vectors $T(u_1), T(u_2), \ldots, T(u_n)$ in $W$.
That is, for every $x \in V$, $T(x)$ is determined by the coordinates $(\alpha_1, \alpha_2, \ldots, \alpha_n)$ of $x$ with respect to the ordered basis $\mathcal{B}$ and the vectors $T(u_1), T(u_2), \ldots, T(u_n) \in W$.
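
As a small illustration of Theorem 4.1.6, the sketch below rebuilds the map $T((x,y)) = (x + y,\ 2x - y,\ x + 3y)$ of Example 4.1.2 from its values on a basis alone. The ordered basis $\big((1,1), (1,-1)\big)$ of $\mathbb{R}^2$ and the name `T_from_basis` are choices made only for this example; with this basis the coordinates of $(x_1, x_2)$ are $\alpha_1 = (x_1 + x_2)/2$ and $\alpha_2 = (x_1 - x_2)/2$.

```python
from fractions import Fraction

def T(v):
    x, y = v
    return (x + y, 2 * x - y, x + 3 * y)

# The only information about T used below: its values on the basis ((1, 1), (1, -1)).
T_u1 = T((1, 1))    # (2, 1, 4)
T_u2 = T((1, -1))   # (0, 3, -2)

def T_from_basis(x):
    # Coordinates of x with respect to the ordered basis ((1, 1), (1, -1)).
    a1 = Fraction(x[0] + x[1], 2)
    a2 = Fraction(x[0] - x[1], 2)
    # T(x) = a1 * T(u1) + a2 * T(u2), as in the proof of Theorem 4.1.6.
    return tuple(a1 * p + a2 * q for p, q in zip(T_u1, T_u2))

for x in [(3, 1), (-2, 5), (0, 0)]:
    print(T_from_basis(x) == T(x))  # True each time
```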


Exercise 4.1.7 1. Which of the following are linear transformations $T : V \to W$? Justify your answers.

(a) Let $V = \mathbb{R}^2$ and $W = \mathbb{R}^3$ with $T((x,y)) = (x + y + 1,\ 2x - y,\ x + 3y)$
(b) Let $V = W = \mathbb{R}^2$ with $T((x,y)) = (x - y,\ x^2 - y^2)$
(c) Let $V = W = \mathbb{R}^2$ with $T((x,y)) = (x - y,\ |x|)$
(d) Let $V = \mathbb{R}^2$ and $W = \mathbb{R}^4$ with $T((x,y)) = (x + y,\ x - y,\ 2x + y,\ 3x - 4y)$
(e) Let $V = W = \mathbb{R}^4$ with $T((x,y,z,w)) = (z, x, w, y)$

2. Recall that $M_2(\mathbb{R})$ is the space of all $2 \times 2$ matrices with real entries. Then, which of the following are linear transformations $T : M_2(\mathbb{R}) \to M_2(\mathbb{R})$?
(a) $T(A) = A^t$ (b) $T(A) = I + A$ (c) $T(A) = A^2$
(d) $T(A) = BAB^{-1}$, where $B$ is some fixed invertible $2 \times 2$ matrix.


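3. Let $T : \mathbb{R} \to \mathbb{R}$ be a map. Then $T$ is a linear transformation if and only if there exists a unique $c \in \mathbb{R}$ such that $T(x) = cx$ for every $x \in \mathbb{R}$.

4. Let $A$ be an $n \times n$ real matrix. Consider the linear transformation
$$T_A(x) = Ax \quad \text{for every } x \in \mathbb{R}^n.$$
Then prove that $T_A^2(x) := T_A(T_A(x)) = A^2 x$. In general, for $k \in \mathbb{N}$, prove that $T_A^k(x) = A^k x$.

5. Use the ideas of matrices to give examples of linear transformations $T, S : \mathbb{R}^3 \to \mathbb{R}^3$ that satisfy:

(a) $T \neq \mathbf{0}$, $T^2 \neq \mathbf{0}$, $T^3 = \mathbf{0}$.

(b) $T \neq \mathbf{0}$, $S \neq \mathbf{0}$, $S \circ T \neq \mathbf{0}$, $T \circ S = \mathbf{0}$; where $T \circ S(x) = T(S(x))$.

(c) $S^2 = T^2$, $S \neq T$.

(d) $T^2 = I$, $T \neq I$.

6. Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a linear transformation such that $T \neq \mathbf{0}$ and $T^2 = \mathbf{0}$. Let $x \in \mathbb{R}^n$ be such that $T(x) \neq \mathbf{0}$. Then prove that the set $\{x, T(x)\}$ is linearly independent. In general, if $T^k \neq \mathbf{0}$ for $1 \le k \le p$ and $T^{p+1} = \mathbf{0}$, then for any vector $x \in \mathbb{R}^n$ with $T^p(x) \neq \mathbf{0}$ prove that the set $\{x, T(x), \ldots, T^p(x)\}$ is linearly independent.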


7. Let $T : \mathbb{R}^n \to \mathbb{R}^m$ be a linear transformation, and let $x_0 \in \mathbb{R}^n$ with $T(x_0) = y$. Consider the sets
$$S = \{x \in \mathbb{R}^n : T(x) = y\} \quad \text{and} \quad N = \{x \in \mathbb{R}^n : T(x) = \mathbf{0}\}.$$
Show that for every $x \in S$ there exists $z \in N$ such that $x = x_0 + z$.

8. Define a map $T : \mathbb{C} \to \mathbb{C}$ by $T(z) = \bar{z}$, the complex conjugate of $z$. Is $T$ linear on
(a) $\mathbb{C}$ over $\mathbb{R}$ (b) $\mathbb{C}$ over $\mathbb{C}$.

9. Find all functions $f : \mathbb{R}^2 \to \mathbb{R}^2$ that satisfy the conditions

(a) $f((x, x)) = (x, x)$ and

(b) $f((x, y)) = (y, x)$ for all $(x, y) \in \mathbb{R}^2$.

That is, $f$ fixes the line $y = x$ and sends the point $(x_1, y_1)$ with $x_1 \neq y_1$ to its mirror image along the line $y = x$.

Is this function a linear transformation? Justify your answer.


Theorem 4.1.8 Let $T : V \to W$ be a linear transformation. For $w \in W$, define the set
$$T^{-1}(w) = \{v \in V : T(v) = w\}.$$

Suppose that the map $T$ is one-one and onto.

1. Then for each $w \in W$, the set $T^{-1}(w)$ consists of a single element.

2. The map $T^{-1} : W \to V$ defined by
$$T^{-1}(w) = v \quad \text{whenever } T(v) = w$$
is a linear transformation.

Proof. Since $T$ is onto, for each $w \in W$ there exists a vector $v \in V$ such that $T(v) = w$. So, the set $T^{-1}(w)$ is non-empty.

Suppose there exist vectors $v_1, v_2 \in V$ such that $T(v_1) = T(v_2) = w$. By assumption, $T$ is one-one and therefore $v_1 = v_2$. This completes the proof of Part 1.

We now show that $T^{-1}$ as defined above is a linear transformation. Let $w_1, w_2 \in W$. Then by Part 1, there exist unique vectors $v_1, v_2 \in V$ such that $T^{-1}(w_1) = v_1$ and $T^{-1}(w_2) = v_2$. Or equivalently, $T(v_1) = w_1$ and $T(v_2) = w_2$. So, for any $\alpha_1, \alpha_2 \in \mathbb{F}$, we have $T(\alpha_1 v_1 + \alpha_2 v_2) = \alpha_1 w_1 + \alpha_2 w_2$.

Thus for any $\alpha_1, \alpha_2 \in \mathbb{F}$,
$$T^{-1}(\alpha_1 w_1 + \alpha_2 w_2) = \alpha_1 v_1 + \alpha_2 v_2 = \alpha_1 T^{-1}(w_1) + \alpha_2 T^{-1}(w_2).$$

Hence $T^{-1} : W \to V$, defined as above, is a linear transformation.


Definition 4.1.9 (Inverse Linear Transformation) Let $T : V \to W$ be a linear transformation. If the map $T$ is one-one and onto, then the map $T^{-1} : W \to V$ defined by
$$T^{-1}(w) = v \quad \text{whenever } T(v) = w$$
is called the inverse of the linear transformation $T$.


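Example 4.1.10 1. Define $T : \mathbb{R}^2 \to \mathbb{R}^2$ by $T((x, y)) = (x + y, x - y)$. Then $T^{-1} : \mathbb{R}^2 \to \mathbb{R}^2$ is defined by
$$T^{-1}((x, y)) = \left(\frac{x + y}{2}, \frac{x - y}{2}\right).$$
Note that
$$T \circ T^{-1}((x, y)) = T\left(T^{-1}((x, y))\right) = T\left(\left(\frac{x + y}{2}, \frac{x - y}{2}\right)\right) = \left(\frac{x + y}{2} + \frac{x - y}{2},\ \frac{x + y}{2} - \frac{x - y}{2}\right) = (x, y).$$

Hence, $T \circ T^{-1} = I$, the identity transformation. Verify that $T^{-1} \circ T = I$. Thus, the map $T^{-1}$ is indeed the inverse of the linear transformation $T$.

2. Recall the vector space $\mathcal{P}_n(\mathbb{R})$ and the linear transformation $T : \mathbb{R}^{n+1} \to \mathcal{P}_n(\mathbb{R})$ defined by
$$T((a_1, a_2, \ldots, a_{n+1})) = a_1 + a_2 x + \cdots + a_{n+1} x^n$$
for $(a_1, a_2, \ldots, a_{n+1}) \in \mathbb{R}^{n+1}$. Then $T^{-1} : \mathcal{P}_n(\mathbb{R}) \to \mathbb{R}^{n+1}$ is defined as
$$T^{-1}(a_1 + a_2 x + \cdots + a_{n+1} x^n) = (a_1, a_2, \ldots, a_{n+1})$$
for $a_1 + a_2 x + \cdots + a_{n+1} x^n \in \mathcal{P}_n(\mathbb{R})$. Verify that $T \circ T^{-1} = T^{-1} \circ T = I$. Hence, conclude that the map $T^{-1}$ is indeed the inverse of the linear transformation $T$.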
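The first part of Example 4.1.10 can be spot-checked numerically. The following is a minimal sketch using exact rational arithmetic; the sample points are arbitrary choices made for this illustration, and checking finitely many points is of course not a proof.

```python
from fractions import Fraction

def T(p):
    """T((x, y)) = (x + y, x - y), as in Example 4.1.10."""
    x, y = p
    return (x + y, x - y)

def T_inv(p):
    """Claimed inverse: ((x + y)/2, (x - y)/2)."""
    x, y = p
    return (Fraction(x + y) / 2, Fraction(x - y) / 2)

# Arbitrary sample points (an assumption of this sketch).
samples = [(0, 0), (1, 0), (3, -5), (Fraction(1, 2), 7)]
both_ways = all(T(T_inv(p)) == p and T_inv(T(p)) == p for p in samples)
print(both_ways)
```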


4.2 Matrix of a linear transformation 


In this section, we relate linear transformations over finite dimensional vector spaces with matrices. For this, we ask the reader to recall the results on ordered bases studied earlier.

Let $V$ and $W$ be finite dimensional vector spaces over $\mathbb{F}$ with respective dimensions $n$ and $m$. Also, let $T : V \to W$ be a linear transformation. Suppose $\mathcal{B}_1 = (v_1, v_2, \ldots, v_n)$ is an ORDERED BASIS of $V$. In the last section, we saw that a linear transformation is determined by its image on a basis of the domain space. We therefore look at the images of the vectors $v_j \in \mathcal{B}_1$ for $1 \le j \le n$.
Now for each $j$, $1 \le j \le n$, the vector $T(v_j)$ lies in $W$. We express these vectors in terms of an ordered basis $\mathcal{B}_2 = (w_1, w_2, \ldots, w_m)$ of $W$. So, for each $j$, $1 \le j \le n$, there exist unique scalars $a_{1j}, a_{2j}, \ldots, a_{mj} \in \mathbb{F}$ such that

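$$\begin{aligned}
T(v_1) &= a_{11} w_1 + a_{21} w_2 + \cdots + a_{m1} w_m \\
T(v_2) &= a_{12} w_1 + a_{22} w_2 + \cdots + a_{m2} w_m \\
&\ \ \vdots \\
T(v_n) &= a_{1n} w_1 + a_{2n} w_2 + \cdots + a_{mn} w_m.
\end{aligned}$$

Or in short, $T(v_j) = \sum_{i=1}^{m} a_{ij} w_i$ for $1 \le j \le n$. In other words, for each $j$, $1 \le j \le n$, the coordinates of $T(v_j)$ with respect to the ordered basis $\mathcal{B}_2$ is the column vector $[a_{1j}, a_{2j}, \ldots, a_{mj}]^t$. Equivalently,
$$[T(v_j)]_{\mathcal{B}_2} = \begin{bmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{mj} \end{bmatrix}.$$
Let $[x]_{\mathcal{B}_1} = [x_1, x_2, \ldots, x_n]^t$ be the coordinates of a vector $x \in V$. Then
$$T(x) = T\Big(\sum_{j=1}^{n} x_j v_j\Big) = \sum_{j=1}^{n} x_j T(v_j) = \sum_{j=1}^{n} x_j \Big(\sum_{i=1}^{m} a_{ij} w_i\Big) = \sum_{i=1}^{m} \Big(\sum_{j=1}^{n} a_{ij} x_j\Big) w_i.$$
Define a matrix $A$ by
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}.$$
Then the coordinates of the vector $T(x)$ with respect to the ordered basis $\mathcal{B}_2$ is
$$[T(x)]_{\mathcal{B}_2} = \begin{bmatrix} \sum_{j=1}^{n} a_{1j} x_j \\ \sum_{j=1}^{n} a_{2j} x_j \\ \vdots \\ \sum_{j=1}^{n} a_{mj} x_j \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = A\, [x]_{\mathcal{B}_1}.$$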


The matrix $A$ is called the matrix of the linear transformation $T$ with respect to the ordered bases $\mathcal{B}_1$ and $\mathcal{B}_2$, and is denoted by $T[\mathcal{B}_1, \mathcal{B}_2]$.
We thus have the following theorem.

Theorem 4.2.1 Let $V$ and $W$ be finite dimensional vector spaces with dimensions $n$ and $m$, respectively. Let $T : V \to W$ be a linear transformation. If $\mathcal{B}_1$ is an ordered basis of $V$ and $\mathcal{B}_2$ is an ordered basis of $W$, then there exists an $m \times n$ matrix $A = T[\mathcal{B}_1, \mathcal{B}_2]$ such that
$$[T(x)]_{\mathcal{B}_2} = A\, [x]_{\mathcal{B}_1}.$$

Remark 4.2.2 Let $\mathcal{B}_1 = (v_1, v_2, \ldots, v_n)$ be an ordered basis of $V$ and $\mathcal{B}_2 = (w_1, w_2, \ldots, w_m)$ be an ordered basis of $W$. Let $T : V \to W$ be a linear transformation with $A = T[\mathcal{B}_1, \mathcal{B}_2]$. Then the first column of $A$ is the coordinate of the vector $T(v_1)$ in the basis $\mathcal{B}_2$. In general, the $i$-th column of $A$ is the coordinate of the vector $T(v_i)$ in the basis $\mathcal{B}_2$.
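The column-by-column recipe of Remark 4.2.2 can be sketched in code for maps between spaces of the form $\mathbb{R}^n$. This is a minimal illustration, not part of the notes: the helper names `coordinates` and `matrix_of` are hypothetical, coordinates are found by Gauss-Jordan elimination over exact fractions, and the supplied bases are assumed to be genuine bases.

```python
from fractions import Fraction

def coordinates(vector, basis):
    """Coordinates of `vector` with respect to the ordered `basis` of R^m."""
    m = len(basis)
    # Augmented matrix: columns are the basis vectors, last column is `vector`.
    rows = [[Fraction(basis[j][i]) for j in range(m)] + [Fraction(vector[i])]
            for i in range(m)]
    for col in range(m):
        # A pivot exists because `basis` is assumed to be a basis.
        pivot = next(r for r in range(col, m) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [e / rows[col][col] for e in rows[col]]
        for r in range(m):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][m] for i in range(m)]

def matrix_of(T, basis_v, basis_w):
    """Matrix whose j-th column is the coordinate vector of T(v_j) in basis_w."""
    columns = [coordinates(T(v), basis_w) for v in basis_v]
    return [[col[i] for col in columns] for i in range(len(basis_w))]

def T(p):
    x, y = p
    return (x + y, x - y)

A = matrix_of(T, [(1, 0), (0, 1)], [(1, 1), (1, -1)])
print(A)
```

The sample call uses the map $T((x, y)) = (x + y, x - y)$ with the standard basis on the domain and $((1, 1), (1, -1))$ on the codomain.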


We now give a few examples to understand the above discussion and the theorem. 


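Example 4.2.3 1. Let $T : \mathbb{R}^2 \to \mathbb{R}^2$ be a linear transformation, given by
$$T((x, y)) = (x + y, x - y).$$
We obtain $T[\mathcal{B}_1, \mathcal{B}_2]$, the matrix of the linear transformation $T$ with respect to the ordered bases
$$\mathcal{B}_1 = ((1, 0), (0, 1)) \quad \text{and} \quad \mathcal{B}_2 = ((1, 1), (1, -1)) \quad \text{of } \mathbb{R}^2.$$
For any vector $(x, y) \in \mathbb{R}^2$,
$$[(x, y)]_{\mathcal{B}_1} = \begin{bmatrix} x \\ y \end{bmatrix}$$
as $(x, y) = x(1, 0) + y(0, 1)$. Also, by definition of the linear transformation $T$, we have
$$T((1, 0)) = (1, 1) = 1 \cdot (1, 1) + 0 \cdot (1, -1), \quad \text{so } [T((1, 0))]_{\mathcal{B}_2} = (1, 0)^t,$$
and
$$T((0, 1)) = (1, -1) = 0 \cdot (1, 1) + 1 \cdot (1, -1).$$
That is, $[T((0, 1))]_{\mathcal{B}_2} = (0, 1)^t$. So $T[\mathcal{B}_1, \mathcal{B}_2] = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$. Observe that in this case, since $(x + y, x - y) = x(1, 1) + y(1, -1)$,
$$[T((x, y))]_{\mathcal{B}_2} = [(x + y, x - y)]_{\mathcal{B}_2} = \begin{bmatrix} x \\ y \end{bmatrix}, \quad \text{and}$$
$$T[\mathcal{B}_1, \mathcal{B}_2]\, [(x, y)]_{\mathcal{B}_1} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x \\ y \end{bmatrix} = [T((x, y))]_{\mathcal{B}_2}.$$

2. Let $\mathcal{B}_1 = ((1, 0, 0), (0, 1, 0), (0, 0, 1))$ and $\mathcal{B}_2 = ((1, 0, 0), (1, 1, 0), (1, 1, 1))$ be two ordered bases of $\mathbb{R}^3$. Define
$$T : \mathbb{R}^3 \to \mathbb{R}^3 \quad \text{by} \quad T(x) = x.$$
Then
$$\begin{aligned}
T((1, 0, 0)) &= 1 \cdot (1, 0, 0) + 0 \cdot (1, 1, 0) + 0 \cdot (1, 1, 1), \\
T((0, 1, 0)) &= -1 \cdot (1, 0, 0) + 1 \cdot (1, 1, 0) + 0 \cdot (1, 1, 1), \text{ and} \\
T((0, 0, 1)) &= 0 \cdot (1, 0, 0) + (-1) \cdot (1, 1, 0) + 1 \cdot (1, 1, 1).
\end{aligned}$$
Thus, we have
$$T[\mathcal{B}_1, \mathcal{B}_2] = \big[[T((1, 0, 0))]_{\mathcal{B}_2}, [T((0, 1, 0))]_{\mathcal{B}_2}, [T((0, 0, 1))]_{\mathcal{B}_2}\big] = \big[(1, 0, 0)^t, (-1, 1, 0)^t, (0, -1, 1)^t\big] = \begin{bmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{bmatrix}.$$
Similarly check that $T[\mathcal{B}_1, \mathcal{B}_1] = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$.

3. Let $T : \mathbb{R}^3 \to \mathbb{R}^2$ be defined by $T((x, y, z)) = (x + y - z, x + z)$. Let $\mathcal{B}_1 = ((1, 0, 0), (0, 1, 0), (0, 0, 1))$ and $\mathcal{B}_2 = ((1, 0), (0, 1))$ be the ordered bases of the domain and range space, respectively. Then
$$T[\mathcal{B}_1, \mathcal{B}_2] = \begin{bmatrix} 1 & 1 & -1 \\ 1 & 0 & 1 \end{bmatrix}.$$
Check that $[T((x, y, z))]_{\mathcal{B}_2} = T[\mathcal{B}_1, \mathcal{B}_2]\, [(x, y, z)]_{\mathcal{B}_1}$.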


Exercise 4.2.4 Recall the space $\mathcal{P}_n(\mathbb{R})$ (the vector space of all polynomials of degree less than or equal to $n$). We define a linear transformation $D : \mathcal{P}_n(\mathbb{R}) \to \mathcal{P}_n(\mathbb{R})$ by
$$D(a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n) = a_1 + 2 a_2 x + \cdots + n a_n x^{n-1}.$$

Find the matrix of the linear transformation $D$.

However, note that the image of the linear transformation is contained in $\mathcal{P}_{n-1}(\mathbb{R})$.

Remark 4.2.5 1. Observe that
$$T[\mathcal{B}_1, \mathcal{B}_2] = \big[[T(v_1)]_{\mathcal{B}_2}, [T(v_2)]_{\mathcal{B}_2}, \ldots, [T(v_n)]_{\mathcal{B}_2}\big].$$

2. It is important to note that
$$[T(x)]_{\mathcal{B}_2} = T[\mathcal{B}_1, \mathcal{B}_2]\, [x]_{\mathcal{B}_1}.$$
That is, we multiply the matrix of the linear transformation with the coordinates $[x]_{\mathcal{B}_1}$ of the vector $x \in V$ to obtain the coordinates of the vector $T(x) \in W$.

3. If $A$ is an $m \times n$ matrix, then $A$ induces a linear transformation $T_A : \mathbb{R}^n \to \mathbb{R}^m$, defined by
$$T_A(x) = Ax.$$
We sometimes write $A$ for $T_A$. Suppose that the standard bases for $\mathbb{R}^n$ and $\mathbb{R}^m$ are the ordered bases $\mathcal{B}_1$ and $\mathcal{B}_2$, respectively. Then observe that
$$T_A[\mathcal{B}_1, \mathcal{B}_2] = A.$$


4.3 Rank-Nullity Theorem


Definition 4.3.1 (Range and Null Space) Let $V, W$ be finite dimensional vector spaces over the same set of scalars and $T : V \to W$ be a linear transformation. We define

1. $\mathcal{R}(T) = \{T(x) : x \in V\}$, and

2. $\mathcal{N}(T) = \{x \in V : T(x) = \mathbf{0}\}$.


Proposition 4.3.2 Let $V$ and $W$ be finite dimensional vector spaces and let $T : V \to W$ be a linear transformation. Suppose that $(v_1, v_2, \ldots, v_n)$ is an ordered basis of $V$. Then

1. (a) $\mathcal{R}(T)$ is a subspace of $W$.
(b) $\mathcal{R}(T) = L(T(v_1), T(v_2), \ldots, T(v_n))$.
(c) $\dim(\mathcal{R}(T)) \le \dim(W)$.

2. (a) $\mathcal{N}(T)$ is a subspace of $V$.
(b) $\dim(\mathcal{N}(T)) \le \dim(V)$.

3. $T$ is one-one $\iff$ $\mathcal{N}(T) = \{\mathbf{0}\}$ is the zero subspace of $V$ $\iff$ $\{T(v_i) : 1 \le i \le n\}$ is a basis of $\mathcal{R}(T)$.

4. $\dim(\mathcal{R}(T)) = \dim(V)$ if and only if $\mathcal{N}(T) = \{\mathbf{0}\}$.


Proof. The results about $\mathcal{R}(T)$ and $\mathcal{N}(T)$ can be easily proved. We thus leave the proof for the readers.
We now assume that $T$ is one-one. We need to show that $\mathcal{N}(T) = \{\mathbf{0}\}$.
Let $u \in \mathcal{N}(T)$. Then by definition, $T(u) = \mathbf{0}$. Also for any linear transformation (see Proposition 4.1.3), $T(\mathbf{0}) = \mathbf{0}$. Thus $T(u) = T(\mathbf{0})$. So, $T$ is one-one implies $u = \mathbf{0}$. That is, $\mathcal{N}(T) = \{\mathbf{0}\}$.

Let $\mathcal{N}(T) = \{\mathbf{0}\}$. We need to show that $T$ is one-one. So, let us assume that for some $u, v \in V$, $T(u) = T(v)$. Then, by linearity of $T$, $T(u - v) = \mathbf{0}$. This implies $u - v \in \mathcal{N}(T) = \{\mathbf{0}\}$. This in turn implies $u = v$. Hence, $T$ is one-one.

The other parts can be similarly proved.


Remark 4.3.3 1. The space $\mathcal{R}(T)$ is called the RANGE SPACE of $T$ and $\mathcal{N}(T)$ is called the NULL SPACE of $T$.

2. We write $\rho(T) = \dim(\mathcal{R}(T))$ and $\nu(T) = \dim(\mathcal{N}(T))$.

3. $\rho(T)$ is called the rank of the linear transformation $T$ and $\nu(T)$ is called the nullity of $T$.


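Example 4.3.4 Determine the range and null space of the linear transformation
$$T : \mathbb{R}^3 \to \mathbb{R}^4 \quad \text{with} \quad T(x, y, z) = (x - y + z,\ y - z,\ x,\ 2x - 5y + 5z).$$

Solution: By Proposition 4.3.2, $\mathcal{R}(T) = L(T(1, 0, 0), T(0, 1, 0), T(0, 0, 1))$. We therefore have
$$\begin{aligned}
\mathcal{R}(T) &= L\big((1, 0, 1, 2), (-1, 1, 0, -5), (1, -1, 0, 5)\big) \\
&= L\big((1, 0, 1, 2), (1, -1, 0, 5)\big) \\
&= \{\alpha(1, 0, 1, 2) + \beta(1, -1, 0, 5) : \alpha, \beta \in \mathbb{R}\} \\
&= \{(\alpha + \beta, -\beta, \alpha, 2\alpha + 5\beta) : \alpha, \beta \in \mathbb{R}\} \\
&= \{(x, y, z, w) \in \mathbb{R}^4 : x + y - z = 0,\ 5y - 2z + w = 0\}.
\end{aligned}$$

Also, by definition
$$\begin{aligned}
\mathcal{N}(T) &= \{(x, y, z) \in \mathbb{R}^3 : T(x, y, z) = \mathbf{0}\} \\
&= \{(x, y, z) \in \mathbb{R}^3 : (x - y + z,\ y - z,\ x,\ 2x - 5y + 5z) = \mathbf{0}\} \\
&= \{(x, y, z) \in \mathbb{R}^3 : x - y + z = 0,\ y - z = 0,\ x = 0,\ 2x - 5y + 5z = 0\} \\
&= \{(x, y, z) \in \mathbb{R}^3 : y - z = 0,\ x = 0\} \\
&= \{(x, y, z) \in \mathbb{R}^3 : y = z,\ x = 0\} \\
&= \{(0, y, y) \in \mathbb{R}^3 : y \text{ arbitrary}\} \\
&= L((0, 1, 1)).
\end{aligned}$$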
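The description of the range and null space obtained in Example 4.3.4 can be sanity-checked directly. The sketch below is an illustrative check only; the helper name `satisfies_range_equations` is introduced here and is not from the notes.

```python
def T(x, y, z):
    """The map of Example 4.3.4."""
    return (x - y + z, y - z, x, 2 * x - 5 * y + 5 * z)

def satisfies_range_equations(v):
    """Test the two equations x + y - z = 0 and 5y - 2z + w = 0."""
    x, y, z, w = v
    return x + y - z == 0 and 5 * y - 2 * z + w == 0

# Images of the standard basis vectors; they span the range space.
images = [T(1, 0, 0), T(0, 1, 0), T(0, 0, 1)]
print(images)
print(all(satisfies_range_equations(v) for v in images))

# The vector (0, 1, 1) should be sent to the zero vector.
print(T(0, 1, 1))
```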


Exercise 4.3.5 1. Let $T : V \to W$ be a linear transformation and let $\{T(v_1), T(v_2), \ldots, T(v_n)\}$ be linearly independent in $\mathcal{R}(T)$. Prove that $\{v_1, v_2, \ldots, v_n\} \subset V$ is linearly independent.

2. Let $T : \mathbb{R}^2 \to \mathbb{R}^3$ be defined by
$$T((1, 0)) = (1, 0, 0), \quad T((0, 1)) = (1, 0, 0).$$
Then the vectors $(1, 0)$ and $(0, 1)$ are linearly independent whereas $T((1, 0))$ and $T((0, 1))$ are linearly dependent.

3. Is there a linear transformation
$$T : \mathbb{R}^3 \to \mathbb{R}^2 \quad \text{such that} \quad T(1, -1, 1) = (1, 2) \text{ and } T(-1, 1, 2) = (1, 0)?$$

4. Recall the vector space $\mathcal{P}_n(\mathbb{R})$. Define a linear transformation
$$D : \mathcal{P}_n(\mathbb{R}) \to \mathcal{P}_n(\mathbb{R})$$
by
$$D(a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n) = a_1 + 2 a_2 x + \cdots + n a_n x^{n-1}.$$
Describe the null space and range space of $D$. Note that the range space is contained in the space $\mathcal{P}_{n-1}(\mathbb{R})$.

5. Let $T : \mathbb{R}^3 \to \mathbb{R}^3$ be defined by
$$T(1, 0, 0) = (0, 0, 1), \quad T(1, 1, 0) = (1, 1, 1) \quad \text{and} \quad T(1, 1, 1) = (1, 1, 0).$$
(a) Find $T(x, y, z)$ for $x, y, z \in \mathbb{R}$.

(b) Find $\mathcal{R}(T)$ and $\mathcal{N}(T)$. Also calculate $\rho(T)$ and $\nu(T)$.

(c) Show that $T^3 = T$ and find the matrix of the linear transformation with respect to the standard basis.

6. Let $T : \mathbb{R}^2 \to \mathbb{R}^2$ be a linear transformation with


Find the matrix representation $T[\mathcal{B}, \mathcal{B}]$ of $T$ with respect to the ordered basis $\mathcal{B} = ((1, 0), (1, 1))$ of $\mathbb{R}^2$.

7. Determine a linear transformation $T : \mathbb{R}^3 \to \mathbb{R}^3$ whose range space is $L\{(1, 2, 0), (0, 1, 1), (1, 3, 1)\}$.

8. Suppose the following chain of matrices is given.
$$A \to B_1 \to B_2 \to \cdots \to B_{k-1} \to B_k \to B.$$
If the row space of $B$ is in the row space of $B_k$, the row space of $B_l$ is in the row space of $B_{l-1}$ for $2 \le l \le k$, and the row space of $B_1$ is in the row space of $A$, then show that the row space of $B$ is in the row space of $A$.


We now state and prove the rank-nullity theorem. This result also follows from Proposition 4.3.2.

Theorem 4.3.6 (Rank Nullity Theorem) Let $T : V \to W$ be a linear transformation and $V$ be a finite dimensional vector space. Then
$$\dim(\mathcal{R}(T)) + \dim(\mathcal{N}(T)) = \dim(V),$$
or equivalently $\rho(T) + \nu(T) = \dim(V)$.

Proof. Let $\dim(V) = n$ and $\dim(\mathcal{N}(T)) = r$. Suppose $\{u_1, u_2, \ldots, u_r\}$ is a basis of $\mathcal{N}(T)$. Since $\{u_1, u_2, \ldots, u_r\}$ is a linearly independent set in $V$, we can extend it to form a basis of $V$ (see Corollary 3.3.15). So, there exist vectors $\{u_{r+1}, u_{r+2}, \ldots, u_n\}$ such that $\{u_1, \ldots, u_r, u_{r+1}, \ldots, u_n\}$ is a basis of $V$.

Therefore, by Proposition 4.3.2,
$$\begin{aligned}
\mathcal{R}(T) &= L(T(u_1), T(u_2), \ldots, T(u_n)) \\
&= L(\mathbf{0}, \ldots, \mathbf{0}, T(u_{r+1}), T(u_{r+2}), \ldots, T(u_n)) \\
&= L(T(u_{r+1}), T(u_{r+2}), \ldots, T(u_n)).
\end{aligned}$$

We now prove that the set $\{T(u_{r+1}), T(u_{r+2}), \ldots, T(u_n)\}$ is linearly independent. Suppose the set is not linearly independent. Then, there exist scalars $\alpha_{r+1}, \alpha_{r+2}, \ldots, \alpha_n$, not all zero, such that
$$\alpha_{r+1} T(u_{r+1}) + \alpha_{r+2} T(u_{r+2}) + \cdots + \alpha_n T(u_n) = \mathbf{0}.$$
That is,
$$T(\alpha_{r+1} u_{r+1} + \alpha_{r+2} u_{r+2} + \cdots + \alpha_n u_n) = \mathbf{0}.$$
So, by definition of $\mathcal{N}(T)$,
$$\alpha_{r+1} u_{r+1} + \alpha_{r+2} u_{r+2} + \cdots + \alpha_n u_n \in \mathcal{N}(T) = L(u_1, \ldots, u_r).$$
Hence, there exist scalars $\alpha_i$, $1 \le i \le r$, such that
$$\alpha_{r+1} u_{r+1} + \alpha_{r+2} u_{r+2} + \cdots + \alpha_n u_n = \alpha_1 u_1 + \alpha_2 u_2 + \cdots + \alpha_r u_r.$$
That is,
$$\alpha_1 u_1 + \cdots + \alpha_r u_r - \alpha_{r+1} u_{r+1} - \cdots - \alpha_n u_n = \mathbf{0}.$$
But the set $\{u_1, u_2, \ldots, u_n\}$ is a basis of $V$ and so linearly independent. Thus by definition of linear independence
$$\alpha_i = 0 \quad \text{for all } i,\ 1 \le i \le n.$$
This contradicts the assumption that $\alpha_{r+1}, \ldots, \alpha_n$ are not all zero. In other words, we have shown that $\{T(u_{r+1}), T(u_{r+2}), \ldots, T(u_n)\}$ is a basis of $\mathcal{R}(T)$. Hence,
$$\dim(\mathcal{R}(T)) + \dim(\mathcal{N}(T)) = (n - r) + r = n = \dim(V).$$
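For a map of the form $x \mapsto Ax$, the identity $\rho + \nu = n$ can be observed computationally: row reduction yields the pivot columns (whose number is the rank) and one null space vector per free column. The sketch below is an illustration under that standard approach; the function names `rref` and `null_space_basis` are chosen here, and the test matrix is the one whose columns are the images of the standard basis in Example 4.3.4.

```python
from fractions import Fraction

def rref(matrix):
    """Return (reduced row echelon form, list of pivot column indices)."""
    rows = [[Fraction(e) for e in row] for row in matrix]
    pivots, r = [], 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column: it is a free column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        rows[r] = [e / rows[r][c] for e in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    return rows, pivots

def null_space_basis(matrix):
    """One basis vector of {x : Ax = 0} for each free column."""
    rows, pivots = rref(matrix)
    n = len(matrix[0])
    basis = []
    for f in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[f] = Fraction(1)
        for r, p in enumerate(pivots):
            v[p] = -rows[r][f]
        basis.append(v)
    return basis

# Matrix of T(x, y, z) = (x - y + z, y - z, x, 2x - 5y + 5z) in standard bases.
A = [[1, -1, 1], [0, 1, -1], [1, 0, 0], [2, -5, 5]]
rank = len(rref(A)[1])
nullity = len(null_space_basis(A))
print(rank, nullity, rank + nullity == len(A[0]))
```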


Using the rank-nullity theorem, we give a short proof of the following result.
Corollary 4.3.7 Let $T : V \to V$ be a linear transformation on a finite dimensional vector space $V$. Then
$$T \text{ is one-one} \iff T \text{ is onto} \iff T \text{ is invertible}.$$

Proof. By Proposition 4.3.2, $T$ is one-one if and only if $\mathcal{N}(T) = \{\mathbf{0}\}$. By the rank-nullity Theorem 4.3.6, $\mathcal{N}(T) = \{\mathbf{0}\}$ is equivalent to the condition $\dim(\mathcal{R}(T)) = \dim(V)$. Or equivalently $T$ is onto.
By definition, $T$ is invertible if $T$ is one-one and onto. But we have shown that $T$ is one-one if and only if $T$ is onto. Thus, we have the last equivalent condition.


Remark 4.3.8 Let $V$ be a finite dimensional vector space and let $T : V \to V$ be a linear transformation. If either $T$ is one-one or $T$ is onto, then $T$ is invertible.


The following are some of the consequences of the rank-nullity theorem. The proof is left as an exercise for the reader.

Corollary 4.3.9 The following are equivalent for an $m \times n$ real matrix $A$.

1. Rank $(A) = k$.

2. There exist exactly $k$ rows of $A$ that are linearly independent.

3. There exist exactly $k$ columns of $A$ that are linearly independent.

4. There is a $k \times k$ submatrix of $A$ with non-zero determinant and every $(k + 1) \times (k + 1)$ submatrix of $A$ has zero determinant.

5. The dimension of the range space of $A$ is $k$.

6. There is a subset of $\mathbb{R}^m$ consisting of exactly $k$ linearly independent vectors $b_1, b_2, \ldots, b_k$ such that the system $Ax = b_i$ for $1 \le i \le k$ is consistent.

7. The dimension of the null space of $A$ is $n - k$.


Exercise 4.3.10 1. Let $T : V \to W$ be a linear transformation.

(a) If $V$ is finite dimensional then show that the null space and the range space of $T$ are also finite dimensional.
(b) If $V$ and $W$ are both finite dimensional then show that
i. if $\dim(V) < \dim(W)$ then $T$ is not onto.
ii. if $\dim(V) > \dim(W)$ then $T$ is not one-one.

2. Let $A$ be an $m \times n$ real matrix. Then

(a) if $n > m$, then the system $Ax = \mathbf{0}$ has infinitely many solutions,

(b) if $n < m$, then there exists a non-zero vector $b = (b_1, b_2, \ldots, b_m)^t$ such that the system $Ax = b$ does not have any solution.


3. Let $A$ be an $m \times n$ matrix. Prove that
$$\text{Row Rank}(A) = \text{Column Rank}(A).$$

[Hint: Define $T_A : \mathbb{R}^n \to \mathbb{R}^m$ by $T_A(v) = Av$ for all $v \in \mathbb{R}^n$. Let Row Rank $(A) = r$. Use Theorem 2.6.1 to show that $Ax = \mathbf{0}$ has $n - r$ linearly independent solutions. This implies
$$\nu(T_A) = \dim(\{v \in \mathbb{R}^n : T_A(v) = \mathbf{0}\}) = \dim(\{v \in \mathbb{R}^n : Av = \mathbf{0}\}) = n - r.$$
Now observe that $\mathcal{R}(T_A)$ is the linear span of the columns of $A$ and use the rank-nullity Theorem 4.3.6 to get the required result.]

4. Prove Theorem 2.6.1.

[Hint: Consider the linear system of equations $Ax = b$ with the orders of $A$, $x$ and $b$, respectively, as $m \times n$, $n \times 1$ and $m \times 1$. Define a linear transformation $T : \mathbb{R}^n \to \mathbb{R}^m$ by $T(v) = Av$. First observe that if the solution exists then $b$ is a linear combination of the columns of $A$, and the linear span of the columns of $A$ gives us $\mathcal{R}(T)$. Note that $\rho(A) = \text{column rank}(A) = \dim(\mathcal{R}(T)) = k$ (say). Then for part 1) one can proceed as follows.

i) Let $C_{i_1}, C_{i_2}, \ldots, C_{i_k}$ be the linearly independent columns of $A$. Then $\text{rank}(A) < \text{rank}([A\ b])$ implies that $\{C_{i_1}, C_{i_2}, \ldots, C_{i_k}, b\}$ is linearly independent. Hence $b \notin L(C_{i_1}, C_{i_2}, \ldots, C_{i_k})$. Hence, the system doesn't have any solution.

On similar lines prove the other two parts.]


5. Let $T, S : V \to V$ be linear transformations with $\dim(V) = n$.

(a) Show that $\mathcal{R}(T + S) \subset \mathcal{R}(T) + \mathcal{R}(S)$. Deduce that $\rho(T + S) \le \rho(T) + \rho(S)$.
Hint: For two subspaces $M, N$ of a vector space $V$, recall the definition of the vector subspace $M + N$.

(b) Use the above and the rank-nullity Theorem 4.3.6 to prove $\nu(T + S) \ge \nu(T) + \nu(S) - n$.

6. Let $V$ be the complex vector space of all complex polynomials of degree at most $n$. Given $k$ distinct complex numbers $z_1, z_2, \ldots, z_k$, we define a linear transformation
$$T : V \to \mathbb{C}^k \quad \text{by} \quad T(P(z)) = (P(z_1), P(z_2), \ldots, P(z_k)).$$
For each $k \ge 1$, determine the dimension of the range space of $T$.

7. Let $A$ be an $n \times n$ real matrix with $A^2 = A$. Consider the linear transformation $T_A : \mathbb{R}^n \to \mathbb{R}^n$, defined by $T_A(v) = Av$ for all $v \in \mathbb{R}^n$. Prove that

(a) $T_A \circ T_A = T_A$ (use the condition $A^2 = A$).

(b) $\mathcal{N}(T_A) \cap \mathcal{R}(T_A) = \{\mathbf{0}\}$.
Hint: Let $x \in \mathcal{N}(T_A) \cap \mathcal{R}(T_A)$. This implies $T_A(x) = \mathbf{0}$ and $x = T_A(y)$ for some $y \in \mathbb{R}^n$. So,
$$x = T_A(y) = (T_A \circ T_A)(y) = T_A(T_A(y)) = T_A(x) = \mathbf{0}.$$

(c) $\mathbb{R}^n = \mathcal{N}(T_A) + \mathcal{R}(T_A)$.
Hint: Let $\{v_1, \ldots, v_k\}$ be a basis of $\mathcal{N}(T_A)$. Extend it to get a basis $\{v_1, \ldots, v_k, v_{k+1}, \ldots, v_n\}$ of $\mathbb{R}^n$. Then by the rank-nullity Theorem 4.3.6, $\{T_A(v_{k+1}), \ldots, T_A(v_n)\}$ is a basis of $\mathcal{R}(T_A)$.
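As a concrete companion to the last exercise, the following sketch illustrates the decomposition for one idempotent matrix by writing $x = (x - Ax) + Ax$. This is an alternative viewpoint to the basis-extension hint, not the route suggested in the notes, and the matrix `A`, the vector `x` and the helper names are assumptions chosen for illustration.

```python
def mat_vec(A, x):
    """Matrix-vector product Ax for a matrix given as a list of rows."""
    return tuple(sum(a * b for a, b in zip(row, x)) for row in A)

def split(A, x):
    """Return (z, r) with z = x - Ax and r = Ax, so that x = z + r."""
    r = mat_vec(A, x)
    z = tuple(xi - ri for xi, ri in zip(x, r))
    return z, r

A = [[1, 0], [1, 0]]   # hypothetical matrix with A*A == A
x = (3, 5)
z, r = split(A, x)
print(z, r)
print(mat_vec(A, z))   # zero vector, so z lies in the null space
```

Here `r` lies in the range space by construction, and `z` is sent to zero because $A(x - Ax) = Ax - A^2 x$ and $A^2 = A$.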


4.4 Similarity of Matrices 


In the last few sections, the following has been discussed in detail:

Given a finite dimensional vector space $V$ of dimension $n$, we fixed an ordered basis $\mathcal{B}$. For any $v \in V$, we calculated the column vector $[v]_{\mathcal{B}}$, to obtain the coordinates of $v$ with respect to the ordered basis $\mathcal{B}$. Also, for any linear transformation $T : V \to V$, we got an $n \times n$ matrix $T[\mathcal{B}, \mathcal{B}]$, the matrix of $T$ with respect to the ordered basis $\mathcal{B}$. That is, once an ordered basis of $V$ is fixed, every linear transformation is represented by a matrix with entries from the scalars.

In this section, we understand the matrix representation of $T$ in terms of different bases $\mathcal{B}_1$ and $\mathcal{B}_2$ of $V$. That is, we relate the two $n \times n$ matrices $T[\mathcal{B}_1, \mathcal{B}_1]$ and $T[\mathcal{B}_2, \mathcal{B}_2]$. We start with the following important theorem. This theorem also enables us to understand WHY THE MATRIX PRODUCT IS DEFINED SOMEWHAT DIFFERENTLY.


Theorem 4.4.1 (Composition of Linear Transformations) Let $V$, $W$ and $Z$ be finite dimensional vector spaces with ordered bases $\mathcal{B}_1, \mathcal{B}_2, \mathcal{B}_3$, respectively. Also, let $T : V \to W$ and $S : W \to Z$ be linear transformations. Then the composition map $S \circ T : V \to Z$ is a linear transformation and
$$(S \circ T)[\mathcal{B}_1, \mathcal{B}_3] = S[\mathcal{B}_2, \mathcal{B}_3]\ T[\mathcal{B}_1, \mathcal{B}_2].$$


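Proof. Let $\mathcal{B}_1 = (u_1, u_2, \ldots, u_n)$, $\mathcal{B}_2 = (v_1, v_2, \ldots, v_m)$ and $\mathcal{B}_3 = (w_1, w_2, \ldots, w_p)$ be ordered bases of $V$, $W$ and $Z$, respectively. Then
$$(S \circ T)[\mathcal{B}_1, \mathcal{B}_3] = \big[[S \circ T(u_1)]_{\mathcal{B}_3}, [S \circ T(u_2)]_{\mathcal{B}_3}, \ldots, [S \circ T(u_n)]_{\mathcal{B}_3}\big].$$
Now for $1 \le t \le n$,
$$\begin{aligned}
(S \circ T)(u_t) = S(T(u_t)) &= S\Big(\sum_{j=1}^{m} (T[\mathcal{B}_1, \mathcal{B}_2])_{jt}\, v_j\Big) = \sum_{j=1}^{m} (T[\mathcal{B}_1, \mathcal{B}_2])_{jt}\, S(v_j) \\
&= \sum_{j=1}^{m} (T[\mathcal{B}_1, \mathcal{B}_2])_{jt} \sum_{k=1}^{p} (S[\mathcal{B}_2, \mathcal{B}_3])_{kj}\, w_k \\
&= \sum_{k=1}^{p} \Big(\sum_{j=1}^{m} (S[\mathcal{B}_2, \mathcal{B}_3])_{kj}\, (T[\mathcal{B}_1, \mathcal{B}_2])_{jt}\Big) w_k \\
&= \sum_{k=1}^{p} \big(S[\mathcal{B}_2, \mathcal{B}_3]\ T[\mathcal{B}_1, \mathcal{B}_2]\big)_{kt}\, w_k.
\end{aligned}$$
So,
$$[(S \circ T)(u_t)]_{\mathcal{B}_3} = \Big(\big(S[\mathcal{B}_2, \mathcal{B}_3]\ T[\mathcal{B}_1, \mathcal{B}_2]\big)_{1t}, \ldots, \big(S[\mathcal{B}_2, \mathcal{B}_3]\ T[\mathcal{B}_1, \mathcal{B}_2]\big)_{pt}\Big)^t.$$
Hence,
$$(S \circ T)[\mathcal{B}_1, \mathcal{B}_3] = \big[[(S \circ T)(u_1)]_{\mathcal{B}_3}, \ldots, [(S \circ T)(u_n)]_{\mathcal{B}_3}\big] = S[\mathcal{B}_2, \mathcal{B}_3]\ T[\mathcal{B}_1, \mathcal{B}_2].$$

This completes the proof.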
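Theorem 4.4.1 can be checked on a small instance with standard bases, where the matrix of a map has the images of the standard basis vectors as its columns. The maps `T` and `S` below are hypothetical examples chosen for this sketch, as are the helper names.

```python
def matrix_of(f, n):
    """Matrix of f : R^n -> R^m in standard bases; column j is f(e_j)."""
    columns = [f(tuple(1 if i == j else 0 for i in range(n))) for j in range(n)]
    m = len(columns[0])
    return [[columns[j][i] for j in range(n)] for i in range(m)]

def mat_mul(X, Y):
    """Usual matrix product of X (p x m) and Y (m x n)."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def T(v):            # hypothetical linear map R^2 -> R^3
    x, y = v
    return (x + y, x - y, 2 * y)

def S(w):            # hypothetical linear map R^3 -> R^2
    a, b, c = w
    return (a + c, b - a)

def S_after_T(v):    # the composition S o T : R^2 -> R^2
    return S(T(v))

lhs = matrix_of(S_after_T, 2)
rhs = mat_mul(matrix_of(S, 3), matrix_of(T, 2))
print(lhs, rhs, lhs == rhs)
```

Note the order of the factors: the matrix of $S$ stands on the left, mirroring $S \circ T$.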


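Proposition 4.4.2 Let $V$ be a finite dimensional vector space and let $T, S : V \to V$ be linear transformations. Then
$$\nu(T) + \nu(S) \ge \nu(T \circ S) \ge \max\{\nu(T), \nu(S)\}.$$
Proof. We first prove the second inequality.
Suppose that $v \in \mathcal{N}(S)$. Then $T \circ S(v) = T(S(v)) = T(\mathbf{0}) = \mathbf{0}$. So, $\mathcal{N}(S) \subset \mathcal{N}(T \circ S)$. Therefore, $\nu(S) \le \nu(T \circ S)$.
Suppose $\dim(V) = n$. Then using the rank-nullity theorem, observe that
$$\nu(T \circ S) \ge \nu(T) \iff n - \nu(T \circ S) \le n - \nu(T) \iff \rho(T \circ S) \le \rho(T).$$
So, to complete the proof of the second inequality, we need to show that $\mathcal{R}(T \circ S) \subset \mathcal{R}(T)$. This is true as $\mathcal{R}(S) \subset V$.

We now prove the first inequality.
Let $k = \nu(S)$ and let $\{v_1, v_2, \ldots, v_k\}$ be a basis of $\mathcal{N}(S)$. Clearly, $\{v_1, v_2, \ldots, v_k\} \subset \mathcal{N}(T \circ S)$ as $T(\mathbf{0}) = \mathbf{0}$. We extend it to get a basis $\{v_1, v_2, \ldots, v_k, u_1, u_2, \ldots, u_\ell\}$ of $\mathcal{N}(T \circ S)$.

Claim: The set $\{S(u_1), S(u_2), \ldots, S(u_\ell)\}$ is a linearly independent subset of $\mathcal{N}(T)$.

As $u_1, u_2, \ldots, u_\ell \in \mathcal{N}(T \circ S)$, the set $\{S(u_1), S(u_2), \ldots, S(u_\ell)\}$ is a subset of $\mathcal{N}(T)$. Let, if possible, the given set be linearly dependent. Then there exist scalars $c_1, c_2, \ldots, c_\ell$, not all zero, such that
$$c_1 S(u_1) + c_2 S(u_2) + \cdots + c_\ell S(u_\ell) = \mathbf{0}.$$
So, the vector $\sum_{i=1}^{\ell} c_i u_i \in \mathcal{N}(S)$ and is a linear combination of the basis vectors $v_1, v_2, \ldots, v_k$ of $\mathcal{N}(S)$. Therefore, there exist scalars $\alpha_1, \alpha_2, \ldots, \alpha_k$ such that
$$\sum_{i=1}^{\ell} c_i u_i = \sum_{i=1}^{k} \alpha_i v_i.$$
Or equivalently
$$\sum_{i=1}^{\ell} c_i u_i + \sum_{i=1}^{k} (-\alpha_i) v_i = \mathbf{0}.$$
That is, the $\mathbf{0}$ vector is a non-trivial linear combination of the basis vectors $v_1, v_2, \ldots, v_k, u_1, u_2, \ldots, u_\ell$ of $\mathcal{N}(T \circ S)$. A contradiction.
Thus, the set $\{S(u_1), S(u_2), \ldots, S(u_\ell)\}$ is a linearly independent subset of $\mathcal{N}(T)$ and so $\nu(T) \ge \ell$. Hence,
$$\nu(T \circ S) = k + \ell \le \nu(S) + \nu(T).$$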


Recall from Theorem 4.1.8 that if $T$ is an invertible linear transformation, then $T^{-1} : V \to V$ is a linear transformation defined by $T^{-1}(u) = v$ whenever $T(v) = u$. We now state an important result about the inverse of a linear transformation. The reader is required to supply the proof (use Theorem 4.4.1).


Theorem 4.4.3 (Inverse of a Linear Transformation) Let $V$ be a finite dimensional vector space with ordered bases $\mathcal{B}_1$ and $\mathcal{B}_2$. Also let $T : V \to V$ be an invertible linear transformation. Then the matrices of $T$ and $T^{-1}$ are related by
$$T[\mathcal{B}_1, \mathcal{B}_2]^{-1} = T^{-1}[\mathcal{B}_2, \mathcal{B}_1].$$


Exercise 4.4.4 For the linear transformations given below, find the matrix $T[\mathcal{B}, \mathcal{B}]$.

1. Let $\mathcal{B} = ((1, 1, 1), (1, -1, 1), (1, 1, -1))$ be an ordered basis of $\mathbb{R}^3$. Define $T : \mathbb{R}^3 \to \mathbb{R}^3$ by $T(1, 1, 1) = (1, -1, 1)$, $T(1, -1, 1) = (1, 1, -1)$, and $T(1, 1, -1) = (1, 1, 1)$. Is $T$ an invertible linear transformation? Give reasons.

2. Let $\mathcal{B} = (1, x, x^2, x^3)$ be an ordered basis of $\mathcal{P}_3(\mathbb{R})$. Define $T : \mathcal{P}_3(\mathbb{R}) \to \mathcal{P}_3(\mathbb{R})$ by
$$T(1) = 1,\quad T(x) = 1 + x,\quad T(x^2) = (1 + x)^2,\quad \text{and}\quad T(x^3) = (1 + x)^3.$$
Prove that $T$ is an invertible linear transformation. Also, find $T^{-1}[\mathcal{B}, \mathcal{B}]$.


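Let $V$ be a vector space with $\dim(V) = n$. Let $\mathcal{B}_1 = (u_1, u_2, \ldots, u_n)$ and $\mathcal{B}_2 = (v_1, v_2, \ldots, v_n)$ be two ordered bases of $V$. Recall from Definition 4.1.5 that $I : V \to V$ is the identity linear transformation defined by $I(x) = x$ for every $x \in V$. Suppose $x \in V$ with $[x]_{\mathcal{B}_1} = (\alpha_1, \alpha_2, \ldots, \alpha_n)^t$ and $[x]_{\mathcal{B}_2} = (\beta_1, \beta_2, \ldots, \beta_n)^t$.

We now express each vector in $\mathcal{B}_2$ as a linear combination of the vectors from $\mathcal{B}_1$. Since $v_i \in V$ for $1 \le i \le n$, and $\mathcal{B}_1$ is a basis of $V$, we can find scalars $a_{ij}$, $1 \le i, j \le n$, such that
$$v_i = I(v_i) = \sum_{j=1}^{n} a_{ji} u_j \quad \text{for all } i,\ 1 \le i \le n.$$
Hence, $[I(v_i)]_{\mathcal{B}_1} = [v_i]_{\mathcal{B}_1} = (a_{1i}, a_{2i}, \ldots, a_{ni})^t$ and
$$I[\mathcal{B}_2, \mathcal{B}_1] = \big[[I(v_1)]_{\mathcal{B}_1}, [I(v_2)]_{\mathcal{B}_1}, \ldots, [I(v_n)]_{\mathcal{B}_1}\big] = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}.$$

Thus, we have proved the following result.


Theorem 4.4.5 (Change of Basis Theorem) Let $V$ be a finite dimensional vector space with ordered bases $\mathcal{B}_1 = (u_1, u_2, \ldots, u_n)$ and $\mathcal{B}_2 = (v_1, v_2, \ldots, v_n)$. Suppose $x \in V$ with $[x]_{\mathcal{B}_1} = (\alpha_1, \alpha_2, \ldots, \alpha_n)^t$ and $[x]_{\mathcal{B}_2} = (\beta_1, \beta_2, \ldots, \beta_n)^t$. Then
$$[x]_{\mathcal{B}_1} = I[\mathcal{B}_2, \mathcal{B}_1]\ [x]_{\mathcal{B}_2}.$$
Equivalently,
$$\begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_n \end{bmatrix}.$$

Note: Observe that the identity linear transformation $I : V \to V$ defined by $I(x) = x$ for every $x \in V$ is invertible and
$$I[\mathcal{B}_2, \mathcal{B}_1]^{-1} = I^{-1}[\mathcal{B}_1, \mathcal{B}_2] = I[\mathcal{B}_1, \mathcal{B}_2].$$
Therefore, we also have
$$[x]_{\mathcal{B}_2} = I[\mathcal{B}_1, \mathcal{B}_2]\ [x]_{\mathcal{B}_1}.$$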

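Let $V$ be a finite dimensional vector space and let $\mathcal{B}_1$ and $\mathcal{B}_2$ be two ordered bases of $V$. Let $T : V \to V$ be a linear transformation. We are now in a position to relate the two matrices $T[\mathcal{B}_1, \mathcal{B}_1]$ and $T[\mathcal{B}_2, \mathcal{B}_2]$.


Theorem 4.4.6 Let $V$ be a finite dimensional vector space and let $\mathcal{B}_1 = (u_1, u_2, \ldots, u_n)$ and $\mathcal{B}_2 = (v_1, v_2, \ldots, v_n)$ be two ordered bases of $V$. Let $T : V \to V$ be a linear transformation with $B = T[\mathcal{B}_1, \mathcal{B}_1]$ and $C = T[\mathcal{B}_2, \mathcal{B}_2]$ as matrix representations of $T$ in bases $\mathcal{B}_1$ and $\mathcal{B}_2$.

Also, let $A = [a_{ij}] = I[\mathcal{B}_2, \mathcal{B}_1]$ be the matrix of the identity linear transformation with respect to the bases $\mathcal{B}_2$ and $\mathcal{B}_1$. Then $BA = AC$. Equivalently $B = ACA^{-1}$.


Proof. For any $x \in V$, we represent $[T(x)]_{\mathcal{B}_2}$ in two ways. Using Theorem 4.2.1, the first expression is
$$[T(x)]_{\mathcal{B}_2} = T[\mathcal{B}_2, \mathcal{B}_2]\ [x]_{\mathcal{B}_2}. \qquad (4.4.1)$$
Using Theorem 4.4.5, the other expression is
$$\begin{aligned}
[T(x)]_{\mathcal{B}_2} &= I[\mathcal{B}_1, \mathcal{B}_2]\ [T(x)]_{\mathcal{B}_1} \\
&= I[\mathcal{B}_1, \mathcal{B}_2]\ T[\mathcal{B}_1, \mathcal{B}_1]\ [x]_{\mathcal{B}_1} \\
&= I[\mathcal{B}_1, \mathcal{B}_2]\ T[\mathcal{B}_1, \mathcal{B}_1]\ I[\mathcal{B}_2, \mathcal{B}_1]\ [x]_{\mathcal{B}_2}. \qquad (4.4.2)
\end{aligned}$$

Hence, using (4.4.1) and (4.4.2), we see that for every $x \in V$,
$$I[\mathcal{B}_1, \mathcal{B}_2]\ T[\mathcal{B}_1, \mathcal{B}_1]\ I[\mathcal{B}_2, \mathcal{B}_1]\ [x]_{\mathcal{B}_2} = T[\mathcal{B}_2, \mathcal{B}_2]\ [x]_{\mathcal{B}_2}.$$
Since the result is true for all $x \in V$, we get
$$I[\mathcal{B}_1, \mathcal{B}_2]\ T[\mathcal{B}_1, \mathcal{B}_1]\ I[\mathcal{B}_2, \mathcal{B}_1] = T[\mathcal{B}_2, \mathcal{B}_2]. \qquad (4.4.3)$$

That is, $A^{-1}BA = C$ or equivalently $ACA^{-1} = B$.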


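Another Proof:
Let $B = [b_{ij}]$ and $C = [c_{ij}]$. Then for $1 \le i \le n$,
$$T(u_i) = \sum_{j=1}^{n} b_{ji} u_j \quad \text{and} \quad T(v_i) = \sum_{j=1}^{n} c_{ji} v_j.$$
So, for each $j$, $1 \le j \le n$,
$$\begin{aligned}
T(v_j) = T(I(v_j)) &= T\Big(\sum_{k=1}^{n} a_{kj} u_k\Big) = \sum_{k=1}^{n} a_{kj} T(u_k) \\
&= \sum_{k=1}^{n} a_{kj} \Big(\sum_{\ell=1}^{n} b_{\ell k} u_\ell\Big) = \sum_{\ell=1}^{n} \Big(\sum_{k=1}^{n} b_{\ell k} a_{kj}\Big) u_\ell
\end{aligned}$$
and therefore,
$$[T(v_j)]_{\mathcal{B}_1} = \begin{bmatrix} \sum_{k=1}^{n} b_{1k} a_{kj} \\ \sum_{k=1}^{n} b_{2k} a_{kj} \\ \vdots \\ \sum_{k=1}^{n} b_{nk} a_{kj} \end{bmatrix} = B \begin{bmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{nj} \end{bmatrix}.$$
Hence $T[\mathcal{B}_2, \mathcal{B}_1] = BA$.
Also, for each $j$, $1 \le j \le n$,
$$\begin{aligned}
T(v_j) = \sum_{k=1}^{n} c_{kj} v_k &= \sum_{k=1}^{n} c_{kj} I(v_k) = \sum_{k=1}^{n} c_{kj} \Big(\sum_{\ell=1}^{n} a_{\ell k} u_\ell\Big) \\
&= \sum_{\ell=1}^{n} \Big(\sum_{k=1}^{n} a_{\ell k} c_{kj}\Big) u_\ell
\end{aligned}$$
and so
$$[T(v_j)]_{\mathcal{B}_1} = \begin{bmatrix} \sum_{k=1}^{n} a_{1k} c_{kj} \\ \sum_{k=1}^{n} a_{2k} c_{kj} \\ \vdots \\ \sum_{k=1}^{n} a_{nk} c_{kj} \end{bmatrix} = A \begin{bmatrix} c_{1j} \\ c_{2j} \\ \vdots \\ c_{nj} \end{bmatrix}.$$
This gives us $T[\mathcal{B}_2, \mathcal{B}_1] = AC$. We thus have $AC = T[\mathcal{B}_2, \mathcal{B}_1] = BA$.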


Let $V$ be a vector space with $\dim(V) = n$, and let $T : V \to V$ be a linear transformation. Then for each ordered basis $\mathcal{B}$ of $V$, we get an $n \times n$ matrix $T[\mathcal{B}, \mathcal{B}]$. Also, we know that for any vector space we have an infinite number of choices for an ordered basis. So, as we change an ordered basis, the matrix of the linear transformation changes. Theorem 4.4.6 tells us that all these matrices are related.

Now, let $A$ and $B$ be two $n \times n$ matrices such that $P^{-1}AP = B$ for some invertible matrix $P$. Recall the linear transformation $T_A : \mathbb{R}^n \to \mathbb{R}^n$ defined by $T_A(x) = Ax$ for all $x \in \mathbb{R}^n$. Then we have seen that if the standard basis of $\mathbb{R}^n$ is the ordered basis $\mathcal{B}$, then $A = T_A[\mathcal{B}, \mathcal{B}]$. Since $P$ is an invertible matrix, its columns are linearly independent and hence we can take its columns as an ordered basis $\mathcal{B}_1$. Then note that $B = T_A[\mathcal{B}_1, \mathcal{B}_1]$. The above observations lead to the following remark and the definition.


Remark 4.4.7 The identity (4.4.3) shows how the matrix representation of a linear transformation $T$ changes if the ordered basis used to compute the matrix representation is changed. Hence, the matrix $I[\mathcal{B}_1, \mathcal{B}_2]$ is called the $\mathcal{B}_1 : \mathcal{B}_2$ change of basis matrix.


Definition 4.4.8 (Similar Matrices) Two square matrices $B$ and $C$ of the same order are said to be similar if there exists a non-singular matrix $P$ such that $B = PCP^{-1}$ or equivalently $BP = PC$.


Remark 4.4.9 Observe that if $A = T[\mathcal{B}, \mathcal{B}]$ then
$$\{S^{-1}AS : S \text{ is an } n \times n \text{ invertible matrix}\}$$


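is the set of all matrices that are similar to the given matrix $A$. Therefore, similar matrices are just different matrix representations of a single linear transformation.

Example 4.4.10 1. Consider $\mathcal{P}_2(\mathbb{R})$, with ordered bases
$$\mathcal{B}_1 = (1,\ 1 + x,\ 1 + x + x^2) \quad \text{and} \quad \mathcal{B}_2 = (1 + x - x^2,\ 1 + 2x + x^2,\ 2 + x + x^2).$$
Then
$$\begin{aligned}
1 + x - x^2 &= 0 \cdot 1 + 2 \cdot (1 + x) + (-1) \cdot (1 + x + x^2), & [1 + x - x^2]_{\mathcal{B}_1} &= (0, 2, -1)^t, \\
1 + 2x + x^2 &= (-1) \cdot 1 + 1 \cdot (1 + x) + 1 \cdot (1 + x + x^2), & [1 + 2x + x^2]_{\mathcal{B}_1} &= (-1, 1, 1)^t, \text{ and} \\
2 + x + x^2 &= 1 \cdot 1 + 0 \cdot (1 + x) + 1 \cdot (1 + x + x^2), & [2 + x + x^2]_{\mathcal{B}_1} &= (1, 0, 1)^t.
\end{aligned}$$
Therefore,
$$\begin{aligned}
I[\mathcal{B}_2, \mathcal{B}_1] &= \big[[I(1 + x - x^2)]_{\mathcal{B}_1}, [I(1 + 2x + x^2)]_{\mathcal{B}_1}, [I(2 + x + x^2)]_{\mathcal{B}_1}\big] \\
&= \big[[1 + x - x^2]_{\mathcal{B}_1}, [1 + 2x + x^2]_{\mathcal{B}_1}, [2 + x + x^2]_{\mathcal{B}_1}\big] \\
&= \begin{bmatrix} 0 & -1 & 1 \\ 2 & 1 & 0 \\ -1 & 1 & 1 \end{bmatrix}.
\end{aligned}$$

Find the matrices $T[\mathcal{B}_1, \mathcal{B}_1]$ and $T[\mathcal{B}_2, \mathcal{B}_2]$. Also verify that
$$T[\mathcal{B}_2, \mathcal{B}_2] = I[\mathcal{B}_1, \mathcal{B}_2]\ T[\mathcal{B}_1, \mathcal{B}_1]\ I[\mathcal{B}_2, \mathcal{B}_1] = I^{-1}[\mathcal{B}_2, \mathcal{B}_1]\ T[\mathcal{B}_1, \mathcal{B}_1]\ I[\mathcal{B}_2, \mathcal{B}_1].$$

2. Consider two bases $\mathcal{B}_1 = ((1, 0, 0), (1, 1, 0), (1, 1, 1))$ and $\mathcal{B}_2 = ((1, 1, -1), (1, 2, 1), (2, 1, 1))$ of $\mathbb{R}^3$. Suppose $T : \mathbb{R}^3 \to \mathbb{R}^3$ is a linear transformation defined by
$$T((x, y, z)) = (x + y,\ x + y + 2z,\ y - z).$$
Then
$$T[\mathcal{B}_1, \mathcal{B}_1] = \begin{bmatrix} 0 & 0 & -2 \\ 1 & 1 & 4 \\ 0 & 1 & 0 \end{bmatrix} \quad \text{and} \quad T[\mathcal{B}_2, \mathcal{B}_2] = \begin{bmatrix} -4/5 & 1 & 8/5 \\ -2/5 & 2 & 9/5 \\ 8/5 & 0 & -1/5 \end{bmatrix}.$$
Find $I[\mathcal{B}_1, \mathcal{B}_2]$ and verify,
$$I[\mathcal{B}_1, \mathcal{B}_2]\ T[\mathcal{B}_1, \mathcal{B}_1]\ I[\mathcal{B}_2, \mathcal{B}_1] = T[\mathcal{B}_2, \mathcal{B}_2].$$
Check that,
$$T[\mathcal{B}_1, \mathcal{B}_1]\ I[\mathcal{B}_2, \mathcal{B}_1] = I[\mathcal{B}_2, \mathcal{B}_1]\ T[\mathcal{B}_2, \mathcal{B}_2] = \begin{bmatrix} 2 & -2 & -2 \\ -2 & 4 & 5 \\ 2 & 1 & 0 \end{bmatrix}.$$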
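The final check in the second part of the example is a plain matrix computation and can be reproduced with exact fractions. In the sketch below, `B`, `C` and `A` stand for $T[\mathcal{B}_1, \mathcal{B}_1]$, $T[\mathcal{B}_2, \mathcal{B}_2]$ and $I[\mathcal{B}_2, \mathcal{B}_1]$; the entries of `A` were worked out by hand for this illustration by writing each vector of $\mathcal{B}_2$ in terms of $\mathcal{B}_1$, so they should be treated as an assumption to verify.

```python
from fractions import Fraction as F

def mat_mul(X, Y):
    """Usual matrix product, exact for Fraction and int entries."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

B = [[0, 0, -2],
     [1, 1, 4],
     [0, 1, 0]]                               # T[B1, B1]
C = [[F(-4, 5), F(1), F(8, 5)],
     [F(-2, 5), F(2), F(9, 5)],
     [F(8, 5), F(0), F(-1, 5)]]               # T[B2, B2]
A = [[0, -1, 1],
     [2, 1, 0],
     [-1, 1, 1]]                              # I[B2, B1], computed by hand

BA = mat_mul(B, A)
AC = mat_mul(A, C)
print(BA)
print(AC == BA)
```

Agreement of the two products is the relation $BA = AC$ of Theorem 4.4.6 for this pair of bases.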


Exercise 4.4.11 1. Let $V$ be an $n$-dimensional vector space and let $T : V \to V$ be a linear transformation. Suppose $T$ has the property that $T^{n-1} \neq \mathbf{0}$ but $T^n = \mathbf{0}$.

(a) Then prove that there exists a vector $u \in V$ such that the set
$$\{u, T(u), \ldots, T^{n-1}(u)\}$$
is a basis of $V$.

(b) Let $\mathcal{B} = (u, T(u), \ldots, T^{n-1}(u))$. Then prove that
$$T[\mathcal{B}, \mathcal{B}] = \begin{bmatrix} 0 & 0 & 0 & \cdots & 0 \\ 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{bmatrix}.$$

(c) Let $A$ be an $n \times n$ matrix with the property that $A^{n-1} \neq \mathbf{0}$ but $A^n = \mathbf{0}$. Then prove that $A$ is similar to the matrix given above.

2. Let $T : \mathbb{R}^3 \to \mathbb{R}^3$ be a linear transformation given by
$$T((x, y, z)) = (x + y + 2z,\ x - y - 3z,\ 2x + 3y + z).$$
Let $\mathcal{B}$ be the standard basis and $\mathcal{B}_1 = ((1, 1, 1), (1, -1, 1), (1, 1, 2))$ be another ordered basis.

(a) Find the matrices $T[\mathcal{B}, \mathcal{B}]$ and $T[\mathcal{B}_1, \mathcal{B}_1]$.
(b) Find the matrix $P$ such that $P^{-1}\, T[\mathcal{B}, \mathcal{B}]\, P = T[\mathcal{B}_1, \mathcal{B}_1]$.

3. Let $T : \mathbb{R}^3 \to \mathbb{R}^3$ be a linear transformation given by
$$T((x, y, z)) = (x,\ x + y,\ x + y + z).$$
Let $\mathcal{B}$ be the standard basis and $\mathcal{B}_1 = ((1, 0, 0), (1, 1, 0), (1, 1, 1))$ be another ordered basis.

(a) Find the matrices $T[\mathcal{B}, \mathcal{B}]$ and $T[\mathcal{B}_1, \mathcal{B}_1]$.
(b) Find the matrix $P$ such that $P^{-1}\, T[\mathcal{B}, \mathcal{B}]\, P = T[\mathcal{B}_1, \mathcal{B}_1]$.

4. Let $\mathcal{B}_1 = ((1, 2, 0), (1, 3, 2), (0, 1, 3))$ and $\mathcal{B}_2 = ((1, 2, 1), (0, 1, 2), (1, 4, 6))$ be two ordered bases of $\mathbb{R}^3$.

(a) Find the change of basis matrix $P$ from $\mathcal{B}_1$ to $\mathcal{B}_2$.
(b) Find the change of basis matrix $Q$ from $\mathcal{B}_2$ to $\mathcal{B}_1$.
(c) Verify that $PQ = I = QP$.
(d) Find the change of basis matrix from the standard basis of $\mathbb{R}^3$ to $\mathcal{B}_1$. What do you notice?


Chapter 5 


Inner Product Spaces 


We have learned that given vectors $\mathbf{i}$ and $\mathbf{j}$ (which are at an angle of $90^\circ$) in a plane, any vector in the plane is a linear combination of the vectors $\mathbf{i}$ and $\mathbf{j}$. In this chapter, we investigate a method by which any basis of a finite dimensional vector space can be transformed into another basis in such a way that the vectors in the new basis are at an angle of $90^\circ$ to each other. To do this, we start by defining a notion of INNER PRODUCT (dot product) in a vector space. This helps us in finding out whether two vectors are at $90^\circ$ or not.


5.1 Definition and Basic Properties


In $\mathbb{R}^2$, given two vectors $x = (x_1, x_2)$, $y = (y_1, y_2)$, we know the inner product $x \cdot y = x_1 y_1 + x_2 y_2$. Note that for any $x, y, z \in \mathbb{R}^2$ and $\alpha \in \mathbb{R}$, this inner product satisfies the conditions
$$x \cdot (y + \alpha z) = x \cdot y + \alpha\, x \cdot z, \quad x \cdot y = y \cdot x, \quad \text{and} \quad x \cdot x \ge 0,$$
and $x \cdot x = 0$ if and only if $x = \mathbf{0}$. Thus, we are motivated to define an inner product on an arbitrary vector space.


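Definition 5.1.1 (Inner Product) Let $V(\mathbb{F})$ be a vector space over $\mathbb{F}$. An inner product over $V(\mathbb{F})$, denoted by $\langle\ ,\ \rangle$, is a map
$$\langle\ ,\ \rangle : V \times V \to \mathbb{F}$$
such that for $u, v, w \in V$ and $a, b \in \mathbb{F}$

1. $\langle au + bv, w \rangle = a \langle u, w \rangle + b \langle v, w \rangle$,

2. $\langle u, v \rangle = \overline{\langle v, u \rangle}$, the complex conjugate of $\langle v, u \rangle$, and

3. $\langle u, u \rangle \ge 0$ for all $u \in V$ and equality holds if and only if $u = \mathbf{0}$.


Definition 5.1.2 (Inner Product Space) Let $V$ be a vector space with an inner product $\langle\ ,\ \rangle$. Then $(V, \langle\ ,\ \rangle)$ is called an inner product space, in short denoted by IPS.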


Example 5.1.3 The first two examples given below are called the STANDARD INNER PRODUCT or the DOT PRODUCT on $\mathbb{R}^n$ and $\mathbb{C}^n$, respectively.

1. Let $V = \mathbb{R}^n$ be the real vector space of dimension $n$. Given two vectors $\mathbf{u} = (u_1, u_2, \ldots, u_n)$ and $\mathbf{v} = (v_1, v_2, \ldots, v_n)$ of $V$, we define

$$\langle\mathbf{u},\mathbf{v}\rangle = u_1v_1 + u_2v_2 + \cdots + u_nv_n = \mathbf{u}\mathbf{v}^t.$$

Verify that $\langle\,,\rangle$ is an inner product.

2. Let $V = \mathbb{C}^n$ be a complex vector space of dimension $n$. Then for $\mathbf{u} = (u_1, u_2, \ldots, u_n)$ and $\mathbf{v} = (v_1, v_2, \ldots, v_n)$ in $V$, check that

$$\langle\mathbf{u},\mathbf{v}\rangle = u_1\overline{v_1} + u_2\overline{v_2} + \cdots + u_n\overline{v_n} = \mathbf{u}\mathbf{v}^*$$

is an inner product.


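3. Let $V = \mathbb{R}^2$ and let $A = \begin{bmatrix} 4 & -1 \\ -1 & 2 \end{bmatrix}$. Define $\langle\mathbf{x},\mathbf{y}\rangle = \mathbf{x}A\mathbf{y}^t$. Check that $\langle\,,\rangle$ is an inner product.

Hint: Note that $\mathbf{x}A\mathbf{y}^t = 4x_1y_1 - x_1y_2 - x_2y_1 + 2x_2y_2$.

4. Let $\mathbf{x} = (x_1, x_2, x_3)$, $\mathbf{y} = (y_1, y_2, y_3) \in \mathbb{R}^3$. Show that $\langle\mathbf{x},\mathbf{y}\rangle = 10x_1y_1 + 3x_1y_2 + 3x_2y_1 + 2x_2y_2 + x_2y_3 + x_3y_2 + x_3y_3$ is an inner product in $\mathbb{R}^3(\mathbb{R})$.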


5. Consider the real vector space $\mathbb{R}^2$. In this example, we define three products, each of which satisfies only two of the three conditions for an inner product. Hence none of the three products is an inner product.

(a) Define $\langle\mathbf{x},\mathbf{y}\rangle = \langle(x_1,x_2),(y_1,y_2)\rangle = x_1y_1$. Then it is easy to verify that the third condition is not valid whereas the first two conditions are valid.

(b) Define $\langle\mathbf{x},\mathbf{y}\rangle = \langle(x_1,x_2),(y_1,y_2)\rangle = x_1^2 + y_1^2 + x_2^2 + y_2^2$. Then it is easy to verify that the first condition is not valid whereas the second and third conditions are valid.

(c) Define $\langle\mathbf{x},\mathbf{y}\rangle = \langle(x_1,x_2),(y_1,y_2)\rangle = x_1y_1^3 + x_2y_2^3$. Then it is easy to verify that the second condition is not valid whereas the first and third conditions are valid.


Remark 5.1.4 Note that in parts 1 and 2 of Example 5.1.3, the inner products are $\mathbf{u}\mathbf{v}^t$ and $\mathbf{u}\mathbf{v}^*$, respectively. This occurs because the vectors $\mathbf{u}$ and $\mathbf{v}$ are row vectors. In general, $\mathbf{u}$ and $\mathbf{v}$ are taken as column vectors, and hence one uses the notation $\mathbf{u}^t\mathbf{v}$ or $\mathbf{u}^*\mathbf{v}$.

Exercise 5.1.5 Verify that the products defined in parts 3 and 4 of Example 5.1.3 are indeed inner products.
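A numerical sanity check can accompany the verification asked for above. For a real symmetric matrix $G$, the product $\mathbf{x}G\mathbf{y}^t$ is an inner product exactly when $G$ is positive definite, and Sylvester's criterion (all leading principal minors positive) tests this. The sketch below is only an illustration and not part of the notes; the function names are hypothetical, and the matrices are the ones read off from parts 3, 4 and 5(a) of Example 5.1.3.

```python
def det(m):
    """Determinant by expansion along the first row (adequate for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def is_inner_product_matrix(g):
    """True if g is symmetric and all its leading principal minors are positive."""
    n = len(g)
    symmetric = all(g[i][j] == g[j][i] for i in range(n) for j in range(n))
    minors_positive = all(det([row[:k] for row in g[:k]]) > 0 for k in range(1, n + 1))
    return symmetric and minors_positive


A3 = [[4, -1], [-1, 2]]                    # part 3
A4 = [[10, 3, 0], [3, 2, 1], [0, 1, 1]]    # part 4
A5a = [[1, 0], [0, 0]]                     # part 5(a): third condition fails
```

A check of this kind does not replace the proof, but it catches arithmetic slips when the coefficients are copied into a matrix.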


Definition 5.1.6 (Length/Norm of a Vector) For $\mathbf{u} \in V$, we define the length (norm) of $\mathbf{u}$, denoted $\|\mathbf{u}\|$, by $\|\mathbf{u}\| = \sqrt{\langle\mathbf{u},\mathbf{u}\rangle}$, the positive square root.


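A very useful and fundamental inequality concerning the inner product is due to Cauchy and Schwarz. The next theorem gives the statement and a proof of this inequality.

Theorem 5.1.7 (Cauchy-Schwarz inequality) Let $V(\mathbb{F})$ be an inner product space. Then for any $\mathbf{u}, \mathbf{v} \in V$

$$|\langle\mathbf{u},\mathbf{v}\rangle| \le \|\mathbf{u}\|\,\|\mathbf{v}\|.$$

The equality holds if and only if the vectors $\mathbf{u}$ and $\mathbf{v}$ are linearly dependent. Furthermore, in the case of equality with $\mathbf{u} \ne \mathbf{0}$,

$$\mathbf{v} = \left\langle\mathbf{v}, \frac{\mathbf{u}}{\|\mathbf{u}\|}\right\rangle \frac{\mathbf{u}}{\|\mathbf{u}\|}.$$

PROOF. If $\mathbf{u} = \mathbf{0}$, then the inequality holds. Let $\mathbf{u} \ne \mathbf{0}$. Note that $\langle\lambda\mathbf{u} + \mathbf{v}, \lambda\mathbf{u} + \mathbf{v}\rangle \ge 0$ for all $\lambda \in \mathbb{F}$. In particular, for $\lambda = -\dfrac{\langle\mathbf{v},\mathbf{u}\rangle}{\|\mathbf{u}\|^2}$, we get

$$\begin{aligned}
0 &\le \langle\lambda\mathbf{u} + \mathbf{v}, \lambda\mathbf{u} + \mathbf{v}\rangle \\
&= \lambda\overline{\lambda}\,\|\mathbf{u}\|^2 + \lambda\langle\mathbf{u},\mathbf{v}\rangle + \overline{\lambda}\langle\mathbf{v},\mathbf{u}\rangle + \|\mathbf{v}\|^2 \\
&= \frac{|\langle\mathbf{v},\mathbf{u}\rangle|^2}{\|\mathbf{u}\|^2} - \frac{|\langle\mathbf{v},\mathbf{u}\rangle|^2}{\|\mathbf{u}\|^2} - \frac{|\langle\mathbf{v},\mathbf{u}\rangle|^2}{\|\mathbf{u}\|^2} + \|\mathbf{v}\|^2 \\
&= \|\mathbf{v}\|^2 - \frac{|\langle\mathbf{v},\mathbf{u}\rangle|^2}{\|\mathbf{u}\|^2}.
\end{aligned}$$

Or, in other words,

$$|\langle\mathbf{v},\mathbf{u}\rangle|^2 \le \|\mathbf{u}\|^2\,\|\mathbf{v}\|^2,$$

and the proof of the inequality is over.

Observe that if $\mathbf{u} \ne \mathbf{0}$ then the equality holds if and only if $\lambda\mathbf{u} + \mathbf{v} = \mathbf{0}$ for $\lambda = -\dfrac{\langle\mathbf{v},\mathbf{u}\rangle}{\|\mathbf{u}\|^2}$. That is, $\mathbf{u}$ and $\mathbf{v}$ are linearly dependent. We leave it for the reader to prove that in this case

$$\mathbf{v} = \left\langle\mathbf{v}, \frac{\mathbf{u}}{\|\mathbf{u}\|}\right\rangle \frac{\mathbf{u}}{\|\mathbf{u}\|}.$$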


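Definition 5.1.8 (Angle between two vectors) Let $V$ be a real inner product space. Then for every pair of non-zero vectors $\mathbf{u}, \mathbf{v} \in V$, by the Cauchy-Schwarz inequality, we have

$$-1 \le \frac{\langle\mathbf{u},\mathbf{v}\rangle}{\|\mathbf{u}\|\,\|\mathbf{v}\|} \le 1.$$

We know that $\cos : [0,\pi] \to [-1,1]$ is a one-one and onto function. Therefore, for every real number $\dfrac{\langle\mathbf{u},\mathbf{v}\rangle}{\|\mathbf{u}\|\,\|\mathbf{v}\|}$, there exists a unique $\theta$, $0 \le \theta \le \pi$, such that

$$\cos\theta = \frac{\langle\mathbf{u},\mathbf{v}\rangle}{\|\mathbf{u}\|\,\|\mathbf{v}\|}.$$

1. The real number $\theta$ with $0 \le \theta \le \pi$ and satisfying $\cos\theta = \dfrac{\langle\mathbf{u},\mathbf{v}\rangle}{\|\mathbf{u}\|\,\|\mathbf{v}\|}$ is called the angle between the two vectors $\mathbf{u}$ and $\mathbf{v}$ in $V$.
2. The vectors $\mathbf{u}$ and $\mathbf{v}$ in $V$ are said to be orthogonal if $\langle\mathbf{u},\mathbf{v}\rangle = 0$.
3. A set of vectors $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ is called mutually orthogonal if $\langle\mathbf{u}_i,\mathbf{u}_j\rangle = 0$ for all $1 \le i \ne j \le n$.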
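The definition of the angle translates directly into a short computation. The following is a minimal sketch, assuming the inner product on $\mathbb{R}^n$ is given by a symmetric positive definite matrix `g` through $\langle\mathbf{u},\mathbf{v}\rangle = \mathbf{u}\,g\,\mathbf{v}^t$; the names `inner` and `angle` are illustrative and do not come from the notes.

```python
import math


def inner(u, v, g):
    """<u, v> = sum_i sum_j u_i g_ij v_j for a symmetric positive definite matrix g."""
    return sum(u[i] * g[i][j] * v[j] for i in range(len(u)) for j in range(len(v)))


def angle(u, v, g):
    """Angle in [0, pi] between the non-zero vectors u and v."""
    c = inner(u, v, g) / math.sqrt(inner(u, u, g) * inner(v, v, g))
    return math.acos(max(-1.0, min(1.0, c)))  # clamp to guard against rounding error


I2 = [[1, 0], [0, 1]]      # standard inner product on R^2
A = [[4, -1], [-1, 2]]     # inner product of Example 5.1.3, part 3

theta_standard = angle((1, 0), (0, 1), I2)
theta_weighted = angle((1, 0), (0, 1), A)
```

The same two vectors are orthogonal under the standard inner product but not under the one defined by `A`, which shows that the angle depends on the chosen inner product.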


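Exercise 5.1.9 1. Let $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n\}$ be the standard basis of $\mathbb{R}^n$. Then prove that with respect to the standard inner product on $\mathbb{R}^n$, the vectors $\mathbf{e}_i$ satisfy the following:

(a) $\|\mathbf{e}_i\| = 1$ for $1 \le i \le n$.
(b) $\langle\mathbf{e}_i,\mathbf{e}_j\rangle = 0$ for $1 \le i \ne j \le n$.

2. Recall the following inner product on $\mathbb{R}^2$: for $\mathbf{x} = (x_1, x_2)^t$ and $\mathbf{y} = (y_1, y_2)^t$,

$$\langle\mathbf{x},\mathbf{y}\rangle = 4x_1y_1 - x_1y_2 - x_2y_1 + 2x_2y_2.$$

(a) Find the angle between the vectors $\mathbf{e}_1 = (1,0)^t$ and $\mathbf{e}_2 = (0,1)^t$.
(b) Let $\mathbf{u} = (1,0)^t$. Find $\mathbf{v} \in \mathbb{R}^2$ such that $\langle\mathbf{v},\mathbf{u}\rangle = 0$.
(c) Find two vectors $\mathbf{x}, \mathbf{y} \in \mathbb{R}^2$ such that $\|\mathbf{x}\| = \|\mathbf{y}\| = 1$ and $\langle\mathbf{x},\mathbf{y}\rangle = 0$.

3. Find an inner product in $\mathbb{R}^2$ such that the following conditions hold:

$$\|(1,2)\| = \|(2,-1)\| = 1, \quad\text{and}\quad \langle(1,2),(2,-1)\rangle = 0.$$

[Hint: Consider a symmetric matrix $A = \begin{bmatrix} a & b \\ b & c \end{bmatrix}$. Define $\langle\mathbf{x},\mathbf{y}\rangle = \mathbf{y}^tA\mathbf{x}$ and solve a system of 3 equations for the unknowns $a, b, c$.]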


4. Let $V$ be a complex vector space with $\dim(V) = n$. Fix an ordered basis $\mathcal{B} = (\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n)$. Define a map

$$\langle\,,\rangle : V \times V \to \mathbb{C} \quad\text{by}\quad \langle\mathbf{u},\mathbf{v}\rangle = \sum_{i=1}^{n} a_i\overline{b_i}$$

whenever $[\mathbf{u}]_{\mathcal{B}} = (a_1, a_2, \ldots, a_n)^t$ and $[\mathbf{v}]_{\mathcal{B}} = (b_1, b_2, \ldots, b_n)^t$. Show that the above defined map is indeed an inner product.

5. Let $\mathbf{x} = (x_1, x_2, x_3)$, $\mathbf{y} = (y_1, y_2, y_3) \in \mathbb{R}^3$. Show that

$$\langle\mathbf{x},\mathbf{y}\rangle = 10x_1y_1 + 3x_1y_2 + 3x_2y_1 + 2x_2y_2 + x_2y_3 + x_3y_2 + x_3y_3$$

is an inner product in $\mathbb{R}^3(\mathbb{R})$. With respect to this inner product, find the angle between the vectors $(1,1,1)$ and $(2,-5,2)$.


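6. Consider the set $M_{n\times n}(\mathbb{R})$ of all real square matrices of order $n$. For $A, B \in M_{n\times n}(\mathbb{R})$ we define $\langle A, B\rangle = \operatorname{tr}(AB^t)$. Then

$$\langle A + B, C\rangle = \operatorname{tr}((A + B)C^t) = \operatorname{tr}(AC^t) + \operatorname{tr}(BC^t) = \langle A, C\rangle + \langle B, C\rangle,$$

$$\langle A, B\rangle = \operatorname{tr}(AB^t) = \operatorname{tr}((AB^t)^t) = \operatorname{tr}(BA^t) = \langle B, A\rangle.$$

Let $A = (a_{ij})$. Then

$$\langle A, A\rangle = \operatorname{tr}(AA^t) = \sum_{i=1}^{n}(AA^t)_{ii} = \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}a_{ij} = \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}^2,$$

and therefore $\langle A, A\rangle > 0$ for all non-zero matrices $A$. So, it is clear that $\langle A, B\rangle$ is an inner product on $M_{n\times n}(\mathbb{R})$.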


7. Let $V$ be the real vector space of all continuous functions with domain $[-2\pi, 2\pi]$. That is, $V = C[-2\pi, 2\pi]$. Then show that $V$ is an inner product space with inner product $\langle f, g\rangle = \int_{-2\pi}^{2\pi} f(x)g(x)\,dx$.

For different values of $m$ and $n$, find the angle between the functions $\cos(mx)$ and $\sin(nx)$.


8. Let $V$ be an inner product space. Prove that

$$\|\mathbf{u} + \mathbf{v}\| \le \|\mathbf{u}\| + \|\mathbf{v}\| \quad\text{for every } \mathbf{u}, \mathbf{v} \in V.$$

This inequality is called the TRIANGLE INEQUALITY.

9. Let $z_1, z_2, \ldots, z_n \in \mathbb{C}$. Use the Cauchy-Schwarz inequality to prove that

$$|z_1 + z_2 + \cdots + z_n| \le \sqrt{n\left(|z_1|^2 + |z_2|^2 + \cdots + |z_n|^2\right)}.$$

When does the equality hold?


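10. Let $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$. Observe that $\langle\mathbf{x},\mathbf{y}\rangle = \langle\mathbf{y},\mathbf{x}\rangle$. Hence or otherwise prove the following:

(a) $\langle\mathbf{x},\mathbf{y}\rangle = 0 \iff \|\mathbf{x} - \mathbf{y}\|^2 = \|\mathbf{x}\|^2 + \|\mathbf{y}\|^2$ (this is called PYTHAGORAS THEOREM).

(b) $\|\mathbf{x}\| = \|\mathbf{y}\| \iff \langle\mathbf{x} + \mathbf{y}, \mathbf{x} - \mathbf{y}\rangle = 0$ ($\mathbf{x}$ and $\mathbf{y}$ form adjacent sides of a rhombus, as the diagonals $\mathbf{x} + \mathbf{y}$ and $\mathbf{x} - \mathbf{y}$ are orthogonal).

(c) $\|\mathbf{x} + \mathbf{y}\|^2 + \|\mathbf{x} - \mathbf{y}\|^2 = 2\|\mathbf{x}\|^2 + 2\|\mathbf{y}\|^2$ (this is called the PARALLELOGRAM LAW).

(d) $4\langle\mathbf{x},\mathbf{y}\rangle = \|\mathbf{x} + \mathbf{y}\|^2 - \|\mathbf{x} - \mathbf{y}\|^2$ (this is called the POLARISATION IDENTITY).

Remark 5.1.10 i. Suppose the norm of a vector is given. Then, the polarisation identity can be used to define an inner product.

ii. Observe that if $\langle\mathbf{x},\mathbf{y}\rangle = 0$ then the parallelogram spanned by the vectors $\mathbf{x}$ and $\mathbf{y}$ is a rectangle. The above equality tells us that the lengths of the two diagonals are equal.

Are these results true if $\mathbf{x}, \mathbf{y} \in \mathbb{C}^n(\mathbb{C})$?

11. Let $\mathbf{x}, \mathbf{y} \in \mathbb{C}^n(\mathbb{C})$. Prove that

(a) $4\langle\mathbf{x},\mathbf{y}\rangle = \|\mathbf{x} + \mathbf{y}\|^2 - \|\mathbf{x} - \mathbf{y}\|^2 + i\|\mathbf{x} + i\mathbf{y}\|^2 - i\|\mathbf{x} - i\mathbf{y}\|^2$.
(b) If $\mathbf{x} \ne \mathbf{0}$ then $\|\mathbf{x} + i\mathbf{x}\|^2 = \|\mathbf{x}\|^2 + \|i\mathbf{x}\|^2$, even though $\langle\mathbf{x}, i\mathbf{x}\rangle \ne 0$.
(c) If $\|\mathbf{x} + \mathbf{y}\|^2 = \|\mathbf{x}\|^2 + \|\mathbf{y}\|^2$ and $\|\mathbf{x} + i\mathbf{y}\|^2 = \|\mathbf{x}\|^2 + \|i\mathbf{y}\|^2$ then show that $\langle\mathbf{x},\mathbf{y}\rangle = 0$.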


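12. Let $V$ be an $n$-dimensional inner product space, with an inner product $\langle\,,\rangle$. Let $\mathbf{u} \in V$ be a fixed vector with $\|\mathbf{u}\| = 1$. Then give reasons for the following statements.

(a) Let $S^{\perp} = \{\mathbf{v} \in V : \langle\mathbf{v},\mathbf{u}\rangle = 0\}$. Then $S^{\perp}$ is a subspace of $V$ of dimension $n - 1$.
(b) Let $0 \ne \alpha \in \mathbb{F}$ and let $S = \{\mathbf{v} \in V : \langle\mathbf{v},\mathbf{u}\rangle = \alpha\}$. Then $S$ is not a subspace of $V$.
(c) For any $\mathbf{v} \in S$, there exists a vector $\mathbf{v}_0 \in S^{\perp}$ such that $\mathbf{v} = \mathbf{v}_0 + \alpha\mathbf{u}$.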
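Theorem 5.1.11 Let $V$ be an inner product space. Let $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ be a set of non-zero, mutually orthogonal vectors of $V$.

1. Then the set $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ is linearly independent.

2. $\left\|\sum_{i=1}^{n}\alpha_i\mathbf{u}_i\right\|^2 = \sum_{i=1}^{n}|\alpha_i|^2\,\|\mathbf{u}_i\|^2$.

3. Let $\dim(V) = n$ and also let $\|\mathbf{u}_i\| = 1$ for $i = 1, 2, \ldots, n$. Then for any $\mathbf{v} \in V$,

$$\mathbf{v} = \sum_{i=1}^{n}\langle\mathbf{v},\mathbf{u}_i\rangle\,\mathbf{u}_i.$$

In particular, $\langle\mathbf{v},\mathbf{u}_i\rangle = 0$ for all $i = 1, 2, \ldots, n$ if and only if $\mathbf{v} = \mathbf{0}$.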
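PROOF. Consider the set of non-zero, mutually orthogonal vectors $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$. Suppose there exist scalars $c_1, c_2, \ldots, c_n$, not all zero, such that

$$c_1\mathbf{u}_1 + c_2\mathbf{u}_2 + \cdots + c_n\mathbf{u}_n = \mathbf{0}.$$

Then for $1 \le i \le n$, we have

$$0 = \langle\mathbf{0},\mathbf{u}_i\rangle = \langle c_1\mathbf{u}_1 + c_2\mathbf{u}_2 + \cdots + c_n\mathbf{u}_n, \mathbf{u}_i\rangle = \sum_{j=1}^{n} c_j\langle\mathbf{u}_j,\mathbf{u}_i\rangle = c_i\|\mathbf{u}_i\|^2,$$

as $\langle\mathbf{u}_j,\mathbf{u}_i\rangle = 0$ for all $j \ne i$. Since $\mathbf{u}_i \ne \mathbf{0}$, we have $\|\mathbf{u}_i\|^2 \ne 0$ and hence $c_i = 0$ for every $i$. This contradicts our assumption that some of the $c_i$'s are non-zero, and establishes the linear independence of a set of non-zero, mutually orthogonal vectors.

For the second part, using $\langle\mathbf{u}_i,\mathbf{u}_j\rangle = \begin{cases} 0 & \text{if } i \ne j \\ \|\mathbf{u}_i\|^2 & \text{if } i = j \end{cases}$ for $1 \le i, j \le n$, we have

$$\begin{aligned}
\left\|\sum_{i=1}^{n}\alpha_i\mathbf{u}_i\right\|^2 &= \left\langle\sum_{i=1}^{n}\alpha_i\mathbf{u}_i, \sum_{j=1}^{n}\alpha_j\mathbf{u}_j\right\rangle = \sum_{i=1}^{n}\alpha_i\left\langle\mathbf{u}_i, \sum_{j=1}^{n}\alpha_j\mathbf{u}_j\right\rangle \\
&= \sum_{i=1}^{n}\alpha_i\sum_{j=1}^{n}\overline{\alpha_j}\langle\mathbf{u}_i,\mathbf{u}_j\rangle = \sum_{i=1}^{n}\alpha_i\overline{\alpha_i}\langle\mathbf{u}_i,\mathbf{u}_i\rangle = \sum_{i=1}^{n}|\alpha_i|^2\,\|\mathbf{u}_i\|^2.
\end{aligned}$$

For the third part, observe from the first part the linear independence of the non-zero, mutually orthogonal vectors $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n$. Since $\dim(V) = n$, they form a basis of $V$. Thus, for every vector $\mathbf{v} \in V$, there exist scalars $\alpha_i$, $1 \le i \le n$, such that $\mathbf{v} = \sum_{j=1}^{n}\alpha_j\mathbf{u}_j$. Hence,

$$\langle\mathbf{v},\mathbf{u}_i\rangle = \left\langle\sum_{j=1}^{n}\alpha_j\mathbf{u}_j, \mathbf{u}_i\right\rangle = \sum_{j=1}^{n}\alpha_j\langle\mathbf{u}_j,\mathbf{u}_i\rangle = \alpha_i.$$

Therefore, we have obtained the required result.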
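The third part of the theorem says that, in an orthonormal basis, coordinates are read off by taking inner products. The sketch below illustrates this in $\mathbb{R}^2$ with the standard inner product; the basis and the vector `v` are chosen only for illustration, and the variable names are not from the notes.

```python
import math


def dot(u, v):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))


s = 1 / math.sqrt(2)
basis = [(s, s), (s, -s)]      # an orthonormal basis of R^2
v = (3.0, 1.0)                 # arbitrary vector, for illustration

# Coordinates of v are the inner products <v, u_i>.
coords = [dot(v, u) for u in basis]

# Rebuild v as the sum of <v, u_i> u_i.
rebuilt = [sum(c * u[k] for c, u in zip(coords, basis)) for k in range(2)]

# Part 2 of the theorem with unit vectors: ||v||^2 equals the sum of squared coordinates.
norm_sq_from_coords = sum(c * c for c in coords)
```

No linear system has to be solved here, which is the practical advantage of an orthonormal basis over an arbitrary one.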


Definition 5.1.12 (Orthonormal Set) Let $V$ be an inner product space. A set of non-zero, mutually orthogonal vectors $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ in $V$ is called an orthonormal set if $\|\mathbf{v}_i\| = 1$ for $i = 1, 2, \ldots, n$.

If the set $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is also a basis of $V$, then the set of vectors $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is called an orthonormal basis of $V$.

Example 5.1.13 1. Consider the vector space $\mathbb{R}^2$ with the standard inner product. Then the standard ordered basis $\mathcal{B} = ((1,0), (0,1))$ is an orthonormal set. Also, the basis $\mathcal{B}_1 = \left(\frac{1}{\sqrt{2}}(1,1), \frac{1}{\sqrt{2}}(1,-1)\right)$ is an orthonormal set.

2. Let $\mathbb{R}^n$ be endowed with the standard inner product. Then by Exercise 5.1.9.1, the standard ordered basis $(\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n)$ is an orthonormal set.


In view of Theorem 5.1.11, we inquire into the question of extracting an orthonormal basis from a given basis. In the next section, we describe a process (called the Gram-Schmidt Orthogonalisation process) that generates an orthonormal set from a given set containing finitely many vectors.

Remark 5.1.14 The last part of the above theorem can be rephrased as "suppose $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is an orthonormal basis of an inner product space $V$. Then for each $\mathbf{u} \in V$ the numbers $\langle\mathbf{u},\mathbf{v}_i\rangle$ for $1 \le i \le n$ are the coordinates of $\mathbf{u}$ with respect to the above basis".

That is, let $\mathcal{B} = (\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n)$ be an ordered orthonormal basis. Then for any $\mathbf{u} \in V$,

$$[\mathbf{u}]_{\mathcal{B}} = (\langle\mathbf{u},\mathbf{v}_1\rangle, \langle\mathbf{u},\mathbf{v}_2\rangle, \ldots, \langle\mathbf{u},\mathbf{v}_n\rangle)^t.$$


5.2 Gram-Schmidt Orthogonalisation Process 


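Let $V$ be a finite dimensional inner product space. Suppose $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n$ is a linearly independent subset of $V$. Then the Gram-Schmidt orthogonalisation process uses the vectors $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n$ to construct new vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$ such that $\langle\mathbf{v}_i,\mathbf{v}_j\rangle = 0$ for $i \ne j$, $\|\mathbf{v}_i\| = 1$ and $\operatorname{Span}\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_i\} = \operatorname{Span}\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_i\}$ for $i = 1, 2, \ldots, n$. This process proceeds with the following idea.

Suppose we are given two vectors $\mathbf{u}$ and $\mathbf{v}$ in a plane. If we want to get vectors $\mathbf{z}$ and $\mathbf{y}$ such that $\mathbf{z}$ is a unit vector in the direction of $\mathbf{u}$ and $\mathbf{y}$ is a unit vector perpendicular to $\mathbf{z}$, then they can be obtained in the following way:

Take the first vector $\mathbf{z} = \dfrac{\mathbf{u}}{\|\mathbf{u}\|}$. Let $\theta$ be the angle between the vectors $\mathbf{u}$ and $\mathbf{v}$. Then $\cos(\theta) = \dfrac{\langle\mathbf{u},\mathbf{v}\rangle}{\|\mathbf{u}\|\,\|\mathbf{v}\|}$. Define $\alpha = \|\mathbf{v}\|\cos(\theta) = \dfrac{\langle\mathbf{u},\mathbf{v}\rangle}{\|\mathbf{u}\|} = \langle\mathbf{z},\mathbf{v}\rangle$. Then $\mathbf{w} = \mathbf{v} - \alpha\,\mathbf{z}$ is a vector perpendicular to the unit vector $\mathbf{z}$, as we have removed the component of $\mathbf{z}$ from $\mathbf{v}$. So, the vectors that we are interested in are $\mathbf{z}$ and $\mathbf{y} = \dfrac{\mathbf{w}}{\|\mathbf{w}\|}$.

This idea is used to give the Gram-Schmidt Orthogonalisation process which we now describe.

Figure 5.1: Gram-Schmidt Process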


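Theorem 5.2.1 (Gram-Schmidt Orthogonalisation Process) Let $V$ be an inner product space. Suppose $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ is a set of linearly independent vectors of $V$. Then there exists a set $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ of vectors of $V$ satisfying the following:

1. $\|\mathbf{v}_i\| = 1$ for $1 \le i \le n$,
2. $\langle\mathbf{v}_i,\mathbf{v}_j\rangle = 0$ for $1 \le i, j \le n$, $i \ne j$, and
3. $L(\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_i) = L(\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_i)$ for $1 \le i \le n$.

PROOF. We successively define the vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$ as follows.

$$\mathbf{v}_1 = \frac{\mathbf{u}_1}{\|\mathbf{u}_1\|}.$$

Calculate $\mathbf{w}_2 = \mathbf{u}_2 - \langle\mathbf{u}_2,\mathbf{v}_1\rangle\mathbf{v}_1$, and let $\mathbf{v}_2 = \dfrac{\mathbf{w}_2}{\|\mathbf{w}_2\|}$.

Obtain $\mathbf{w}_3 = \mathbf{u}_3 - \langle\mathbf{u}_3,\mathbf{v}_1\rangle\mathbf{v}_1 - \langle\mathbf{u}_3,\mathbf{v}_2\rangle\mathbf{v}_2$, and let $\mathbf{v}_3 = \dfrac{\mathbf{w}_3}{\|\mathbf{w}_3\|}$.

In general, if $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_{i-1}$ are already obtained, we compute

$$\mathbf{w}_i = \mathbf{u}_i - \langle\mathbf{u}_i,\mathbf{v}_1\rangle\mathbf{v}_1 - \langle\mathbf{u}_i,\mathbf{v}_2\rangle\mathbf{v}_2 - \cdots - \langle\mathbf{u}_i,\mathbf{v}_{i-1}\rangle\mathbf{v}_{i-1}, \qquad (5.2.1)$$

and define

$$\mathbf{v}_i = \frac{\mathbf{w}_i}{\|\mathbf{w}_i\|}.$$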
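We prove the theorem by induction on $n$, the number of linearly independent vectors.

For $n = 1$, we have $\mathbf{v}_1 = \dfrac{\mathbf{u}_1}{\|\mathbf{u}_1\|}$. Since $\mathbf{u}_1 \ne \mathbf{0}$, $\mathbf{v}_1 \ne \mathbf{0}$ and

$$\|\mathbf{v}_1\|^2 = \langle\mathbf{v}_1,\mathbf{v}_1\rangle = \left\langle\frac{\mathbf{u}_1}{\|\mathbf{u}_1\|}, \frac{\mathbf{u}_1}{\|\mathbf{u}_1\|}\right\rangle = \frac{\langle\mathbf{u}_1,\mathbf{u}_1\rangle}{\|\mathbf{u}_1\|^2} = 1.$$

Hence, the result holds for $n = 1$.

Let the result hold for all $k \le n - 1$. That is, suppose we are given any set of $k$, $1 \le k \le n - 1$, linearly independent vectors $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_k\}$ of $V$. Then by the inductive assumption, there exists a set $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\}$ of vectors satisfying the following:

1. $\|\mathbf{v}_i\| = 1$ for $1 \le i \le k$,
2. $\langle\mathbf{v}_i,\mathbf{v}_j\rangle = 0$ for $1 \le i \ne j \le k$, and
3. $L(\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_i) = L(\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_i)$ for $1 \le i \le k$.

Now, let us assume that we are given a set of $n$ linearly independent vectors $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ of $V$. Then by the inductive assumption, we already have vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_{n-1}$ satisfying

1. $\|\mathbf{v}_i\| = 1$ for $1 \le i \le n - 1$,
2. $\langle\mathbf{v}_i,\mathbf{v}_j\rangle = 0$ for $1 \le i \ne j \le n - 1$, and
3. $L(\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_i) = L(\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_i)$ for $1 \le i \le n - 1$.

Using (5.2.1), we define

$$\mathbf{w}_n = \mathbf{u}_n - \langle\mathbf{u}_n,\mathbf{v}_1\rangle\mathbf{v}_1 - \langle\mathbf{u}_n,\mathbf{v}_2\rangle\mathbf{v}_2 - \cdots - \langle\mathbf{u}_n,\mathbf{v}_{n-1}\rangle\mathbf{v}_{n-1}. \qquad (5.2.2)$$

We first show that $\mathbf{w}_n \notin L(\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_{n-1})$. This will also imply that $\mathbf{w}_n \ne \mathbf{0}$ and hence $\mathbf{v}_n = \dfrac{\mathbf{w}_n}{\|\mathbf{w}_n\|}$ is well defined.

On the contrary, assume that $\mathbf{w}_n \in L(\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_{n-1})$. Then there exist scalars $\alpha_1, \alpha_2, \ldots, \alpha_{n-1}$ such that

$$\mathbf{w}_n = \alpha_1\mathbf{v}_1 + \alpha_2\mathbf{v}_2 + \cdots + \alpha_{n-1}\mathbf{v}_{n-1}.$$

So, by (5.2.2),

$$\mathbf{u}_n = (\alpha_1 + \langle\mathbf{u}_n,\mathbf{v}_1\rangle)\mathbf{v}_1 + (\alpha_2 + \langle\mathbf{u}_n,\mathbf{v}_2\rangle)\mathbf{v}_2 + \cdots + (\alpha_{n-1} + \langle\mathbf{u}_n,\mathbf{v}_{n-1}\rangle)\mathbf{v}_{n-1}.$$

Thus, by the third induction assumption,

$$\mathbf{u}_n \in L(\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_{n-1}) = L(\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_{n-1}).$$

This contradicts the given assumption that the set of vectors $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ is linearly independent.

So, $\mathbf{w}_n \ne \mathbf{0}$. Define $\mathbf{v}_n = \dfrac{\mathbf{w}_n}{\|\mathbf{w}_n\|}$. Then $\|\mathbf{v}_n\| = 1$. Also, it can be easily verified that $\langle\mathbf{v}_n,\mathbf{v}_i\rangle = 0$ for $1 \le i \le n - 1$. Hence, by the principle of mathematical induction, the proof of the theorem is complete.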
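The construction in the proof can be sketched in a few lines of code. This is a minimal illustration for $\mathbb{R}^n$ with the standard inner product, assuming linearly independent input; the function name `gram_schmidt` and the tolerance `1e-12` are choices made here, not part of the notes.

```python
import math


def dot(u, v):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(vectors):
    """Return orthonormal v_1, ..., v_n with span(v_1..v_i) = span(u_1..u_i).

    Follows equation (5.2.1): w_i = u_i - sum_j <u_i, v_j> v_j, v_i = w_i / ||w_i||.
    Assumes the input vectors are linearly independent.
    """
    ortho = []
    for u in vectors:
        w = list(u)
        for v in ortho:
            c = dot(u, v)                                  # <u_i, v_j>
            w = [wk - c * vk for wk, vk in zip(w, v)]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:                                   # w_i = 0 signals dependence
            raise ValueError("input vectors are linearly dependent")
        ortho.append([wk / norm for wk in w])
    return ortho


vs = gram_schmidt([(1, 0, 1, 0), (0, 1, 0, 1), (1, -1, 1, 1)])
```

In floating point arithmetic the computed vectors are orthonormal only up to rounding error, so comparisons should use a tolerance rather than exact equality.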


We illustrate the Gram-Schmidt process by the following example. 


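Example 5.2.2 Let $\{(1,-1,1,1), (1,0,1,0), (0,1,0,1)\}$ be a linearly independent set in $\mathbb{R}^4(\mathbb{R})$. Find an orthonormal set $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ such that $L((1,-1,1,1), (1,0,1,0), (0,1,0,1)) = L(\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3)$.

Solution: Let $\mathbf{u}_1 = (1,0,1,0)$. Define $\mathbf{v}_1 = \dfrac{(1,0,1,0)}{\sqrt{2}}$. Let $\mathbf{u}_2 = (0,1,0,1)$. Then

$$\mathbf{w}_2 = (0,1,0,1) - \left\langle(0,1,0,1), \frac{(1,0,1,0)}{\sqrt{2}}\right\rangle\mathbf{v}_1 = (0,1,0,1).$$

Hence, $\mathbf{v}_2 = \dfrac{(0,1,0,1)}{\sqrt{2}}$. Let $\mathbf{u}_3 = (1,-1,1,1)$. Then

$$\begin{aligned}
\mathbf{w}_3 &= (1,-1,1,1) - \left\langle(1,-1,1,1), \frac{(1,0,1,0)}{\sqrt{2}}\right\rangle\frac{(1,0,1,0)}{\sqrt{2}} - \left\langle(1,-1,1,1), \frac{(0,1,0,1)}{\sqrt{2}}\right\rangle\frac{(0,1,0,1)}{\sqrt{2}} \\
&= (0,-1,0,1)
\end{aligned}$$

and $\mathbf{v}_3 = \dfrac{(0,-1,0,1)}{\sqrt{2}}$.

Remark 5.2.3 1. Let $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_k\}$ be any basis of a $k$-dimensional subspace $W$ of $\mathbb{R}^n$. Then by the Gram-Schmidt orthogonalisation process, we get an orthonormal set $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\} \subset \mathbb{R}^n$ with $W = L(\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k)$, and for $1 \le i \le k$,

$$L(\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_i) = L(\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_i).$$

2. Suppose we are given a set of $n$ vectors, $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ of $V$, that are linearly dependent. Then by Corollary 3.2.5, there exists a smallest $k$, $2 \le k \le n$, such that

$$L(\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_k) = L(\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_{k-1}).$$

We claim that in this case, $\mathbf{w}_k = \mathbf{0}$.

Since we have chosen the smallest $k$ satisfying

$$L(\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_i) = L(\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_{i-1})$$

for $2 \le i \le n$, the set $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_{k-1}\}$ is linearly independent (use Corollary 3.2.5). So, by Theorem 5.2.1, there exists an orthonormal set $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_{k-1}\}$ such that

$$L(\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_{k-1}) = L(\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_{k-1}).$$

As $\mathbf{u}_k \in L(\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_{k-1})$, by Remark 5.1.14,

$$\mathbf{u}_k = \langle\mathbf{u}_k,\mathbf{v}_1\rangle\mathbf{v}_1 + \langle\mathbf{u}_k,\mathbf{v}_2\rangle\mathbf{v}_2 + \cdots + \langle\mathbf{u}_k,\mathbf{v}_{k-1}\rangle\mathbf{v}_{k-1}.$$

So, by the definition of $\mathbf{w}_k$, $\mathbf{w}_k = \mathbf{0}$.

Therefore, in this case, we can continue with the Gram-Schmidt process by replacing $\mathbf{u}_k$ by $\mathbf{u}_{k+1}$.

3. Let $S$ be a countably infinite set of linearly independent vectors. Then one can apply the Gram-Schmidt process to get a countably infinite orthonormal set.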


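4. Let $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\}$ be an orthonormal subset of $\mathbb{R}^n$. Let $\mathcal{B} = (\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n)$ be the standard ordered basis of $\mathbb{R}^n$. Then there exist real numbers $\alpha_{ji}$, $1 \le j \le n$, $1 \le i \le k$, such that

$$[\mathbf{v}_i]_{\mathcal{B}} = (\alpha_{1i}, \alpha_{2i}, \ldots, \alpha_{ni})^t.$$

Let $A = [\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k]$. Then in the ordered basis $\mathcal{B}$, we have

$$A = \begin{bmatrix} \alpha_{11} & \alpha_{12} & \cdots & \alpha_{1k} \\ \alpha_{21} & \alpha_{22} & \cdots & \alpha_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ \alpha_{n1} & \alpha_{n2} & \cdots & \alpha_{nk} \end{bmatrix}$$

is an $n \times k$ matrix.

Also, observe that the conditions $\|\mathbf{v}_i\| = 1$ and $\langle\mathbf{v}_i,\mathbf{v}_j\rangle = 0$ for $1 \le i \ne j \le k$ imply that

$$1 = \|\mathbf{v}_i\| = \|\mathbf{v}_i\|^2 = \langle\mathbf{v}_i,\mathbf{v}_i\rangle = \sum_{j=1}^{n}\alpha_{ji}^2, \quad\text{and}\quad 0 = \langle\mathbf{v}_i,\mathbf{v}_j\rangle = \sum_{s=1}^{n}\alpha_{si}\alpha_{sj}. \qquad (5.2.3)$$

Note that

$$A^tA = \begin{bmatrix} \mathbf{v}_1^t \\ \mathbf{v}_2^t \\ \vdots \\ \mathbf{v}_k^t \end{bmatrix}[\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k] = \begin{bmatrix} \|\mathbf{v}_1\|^2 & \langle\mathbf{v}_1,\mathbf{v}_2\rangle & \cdots & \langle\mathbf{v}_1,\mathbf{v}_k\rangle \\ \langle\mathbf{v}_2,\mathbf{v}_1\rangle & \|\mathbf{v}_2\|^2 & \cdots & \langle\mathbf{v}_2,\mathbf{v}_k\rangle \\ \vdots & \vdots & \ddots & \vdots \\ \langle\mathbf{v}_k,\mathbf{v}_1\rangle & \langle\mathbf{v}_k,\mathbf{v}_2\rangle & \cdots & \|\mathbf{v}_k\|^2 \end{bmatrix} = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} = I_k.$$

Or using (5.2.3), in the language of matrices, we get

$$A^tA = \begin{bmatrix} \alpha_{11} & \alpha_{21} & \cdots & \alpha_{n1} \\ \alpha_{12} & \alpha_{22} & \cdots & \alpha_{n2} \\ \vdots & \vdots & \ddots & \vdots \\ \alpha_{1k} & \alpha_{2k} & \cdots & \alpha_{nk} \end{bmatrix}\begin{bmatrix} \alpha_{11} & \alpha_{12} & \cdots & \alpha_{1k} \\ \alpha_{21} & \alpha_{22} & \cdots & \alpha_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ \alpha_{n1} & \alpha_{n2} & \cdots & \alpha_{nk} \end{bmatrix} = I_k.$$

The reader may have noticed that when $k = n$, the matrix $A$ is square and its inverse is its transpose. Such matrices are called orthogonal matrices and they have a special role to play.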


Definition 5.2.4 (Orthogonal Matrix) An $n \times n$ real matrix $A$ is said to be an orthogonal matrix if $AA^t = A^tA = I_n$.
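The defining condition is easy to test numerically. The sketch below checks $AA^t = A^tA = I_n$ up to a tolerance, using a plane rotation as an example of an orthogonal matrix; the helper names and the angle `theta` are assumptions made for this illustration only.

```python
import math


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def is_orthogonal(a, tol=1e-12):
    """True if a is square and both a a^t and a^t a equal the identity up to tol."""
    n = len(a)
    if any(len(row) != n for row in a):
        return False
    at = transpose(a)
    for prod in (matmul(a, at), matmul(at, a)):
        for i in range(n):
            for j in range(n):
                if abs(prod[i][j] - (1.0 if i == j else 0.0)) > tol:
                    return False
    return True


theta = 0.3  # arbitrary angle, chosen only for illustration
rotation = [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]
shear = [[1, 1], [0, 1]]  # invertible but not orthogonal
```

Here `is_orthogonal(rotation)` holds while `is_orthogonal(shear)` does not, since the columns of the shear matrix are not orthonormal.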


It is worthwhile to solve the following exercises. 


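Exercise 5.2.5 1. Let $A$ and $B$ be two $n \times n$ orthogonal matrices. Then prove that $AB$ and $BA$ are both orthogonal matrices.

2. Let $A$ be an $n \times n$ orthogonal matrix. Then prove that

(a) the rows of $A$ form an orthonormal basis of $\mathbb{R}^n$.
(b) the columns of $A$ form an orthonormal basis of $\mathbb{R}^n$.
(c) for any two vectors $\mathbf{x}, \mathbf{y} \in \mathbb{R}^{n\times 1}$, $\langle A\mathbf{x}, A\mathbf{y}\rangle = \langle\mathbf{x},\mathbf{y}\rangle$.
(d) for any vector $\mathbf{x} \in \mathbb{R}^{n\times 1}$, $\|A\mathbf{x}\| = \|\mathbf{x}\|$.

3. Let $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ be an orthonormal basis of $\mathbb{R}^n$. Let $\mathcal{B} = (\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n)$ be the standard basis of $\mathbb{R}^n$. Construct an $n \times n$ matrix $A$ by

$$A = [\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n] = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}$$

where

$$\mathbf{u}_i = \sum_{j=1}^{n} a_{ji}\mathbf{e}_j, \quad\text{for } 1 \le i \le n.$$

Prove that $A^tA = I_n$. Hence deduce that $A$ is an orthogonal matrix.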


4. Let $A$ be an $n \times n$ upper triangular matrix. If $A$ is also an orthogonal matrix, then prove that $A$ is a diagonal matrix with diagonal entries $\pm 1$. In particular, if the diagonal entries of $A$ are positive, then $A = I_n$.

Theorem 5.2.6 (QR Decomposition) Let $A$ be a square matrix of order $n$. Then there exist matrices $Q$ and $R$ such that $Q$ is orthogonal and $R$ is upper triangular with $A = QR$.

In case $A$ is non-singular, the diagonal entries of $R$ can be chosen to be positive. Also, in this case, the decomposition is unique.


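PROOF. We prove the theorem when $A$ is non-singular. The proof for the singular case is left as an exercise.

Let the columns of $A$ be $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$. The Gram-Schmidt orthogonalisation process applied to the vectors $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$ gives the vectors $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n$ satisfying

$$\left.\begin{aligned} &L(\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_i) = L(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_i), \\ &\|\mathbf{u}_i\| = 1, \quad \langle\mathbf{u}_i,\mathbf{u}_j\rangle = 0, \end{aligned}\right\} \quad\text{for } 1 \le i \ne j \le n. \qquad (5.2.4)$$

Now, consider the ordered basis $\mathcal{B} = (\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n)$. From (5.2.4), for $1 \le i \le n$, we have $L(\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_i) = L(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_i)$. So, we can find scalars $\alpha_{ji}$, $1 \le j \le i$, such that

$$\mathbf{x}_i = \alpha_{1i}\mathbf{u}_1 + \alpha_{2i}\mathbf{u}_2 + \cdots + \alpha_{ii}\mathbf{u}_i, \quad\text{that is,}\quad [\mathbf{x}_i]_{\mathcal{B}} = (\alpha_{1i}, \ldots, \alpha_{ii}, 0, \ldots, 0)^t. \qquad (5.2.5)$$

Let $Q = [\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n]$. Then by Exercise 5.2.5.3, $Q$ is an orthogonal matrix. We now define an $n \times n$ upper triangular matrix $R$ by

$$R = \begin{bmatrix} \alpha_{11} & \alpha_{12} & \cdots & \alpha_{1n} \\ 0 & \alpha_{22} & \cdots & \alpha_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \alpha_{nn} \end{bmatrix}.$$

By using (5.2.5), we get

$$\begin{aligned}
QR &= [\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n]\begin{bmatrix} \alpha_{11} & \alpha_{12} & \cdots & \alpha_{1n} \\ 0 & \alpha_{22} & \cdots & \alpha_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \alpha_{nn} \end{bmatrix} \\
&= \left[\alpha_{11}\mathbf{u}_1,\; \alpha_{12}\mathbf{u}_1 + \alpha_{22}\mathbf{u}_2,\; \ldots,\; \sum_{i=1}^{n}\alpha_{in}\mathbf{u}_i\right] \\
&= [\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n] = A.
\end{aligned}$$

Thus, we see that $A = QR$, where $Q$ is an orthogonal matrix (see Remark 5.2.3.4) and $R$ is an upper triangular matrix.

The proof doesn't guarantee that for $1 \le i \le n$, $\alpha_{ii}$ is positive. But this can be achieved by replacing the vector $\mathbf{u}_i$ by $-\mathbf{u}_i$ whenever $\alpha_{ii}$ is negative.

Uniqueness: suppose $Q_1R_1 = Q_2R_2$; then $Q_2^{-1}Q_1 = R_2R_1^{-1}$. Observe the following properties of upper triangular matrices.

1. The inverse of an upper triangular matrix is also an upper triangular matrix, and
2. the product of upper triangular matrices is also upper triangular.

Thus the matrix $R_2R_1^{-1}$ is an upper triangular matrix with positive diagonal entries. Also, by Exercise 5.2.5.1, the matrix $Q_2^{-1}Q_1$ is an orthogonal matrix. Hence, by Exercise 5.2.5.4, $R_2R_1^{-1} = I_n$. So, $R_2 = R_1$ and therefore $Q_2 = Q_1$.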
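The proof is constructive, so it can be turned into a short routine. The following is a sketch for a non-singular real square matrix with the standard inner product: the entries of $R$ are the coefficients $\alpha_{ij} = \langle\mathbf{x}_j,\mathbf{u}_i\rangle$ of (5.2.5), and the diagonal entries are the norms $\|\mathbf{w}_j\|$, hence positive. The function name `qr_gram_schmidt` and the sample matrix are illustrative assumptions.

```python
import math


def qr_gram_schmidt(a):
    """QR factorisation of a non-singular square matrix a (given as a list of rows).

    Applies Gram-Schmidt to the columns of a. Division by zero occurs if a is singular.
    """
    n = len(a)
    cols = [[a[i][j] for i in range(n)] for j in range(n)]
    q_cols = []
    r = [[0.0] * n for _ in range(n)]
    for j, x in enumerate(cols):
        w = list(x)
        for i, u in enumerate(q_cols):
            r[i][j] = sum(p * q for p, q in zip(x, u))       # alpha_ij = <x_j, u_i>
            w = [wk - r[i][j] * uk for wk, uk in zip(w, u)]
        r[j][j] = math.sqrt(sum(wk * wk for wk in w))        # alpha_jj = ||w_j|| > 0
        q_cols.append([wk / r[j][j] for wk in w])
    q = [[q_cols[j][i] for j in range(n)] for i in range(n)]  # columns of q are the u_j
    return q, r


A = [[1, 0, 1, 2], [0, 1, -1, 1], [1, 0, 1, 1], [0, 1, 1, 1]]  # sample non-singular matrix
Q, R = qr_gram_schmidt(A)
```

Because the diagonal of `R` is positive by construction, this routine returns the unique factorisation described in the theorem, up to rounding error.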


Suppose we have a matrix $A = [\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_k]$ of dimension $n \times k$ with $\operatorname{rank}(A) = r$. Then by Remark 5.2.3.2, the application of the Gram-Schmidt orthogonalisation process yields a set $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_r\}$ of orthonormal vectors of $\mathbb{R}^n$. In this case, for each $i$, $1 \le i \le r$, we have

$$L(\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_i) = L(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_j), \quad\text{for some } j,\; i \le j \le k.$$

Hence, proceeding on the lines of the above theorem, we have the following result.

Theorem 5.2.7 (Generalised QR Decomposition) Let $A$ be an $n \times k$ matrix of rank $r$. Then $A = QR$, where

1. $Q$ is an $n \times r$ matrix with $Q^tQ = I_r$. That is, the columns of $Q$ form an orthonormal set,
2. if $Q = [\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_r]$, then $L(\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_r) = L(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_k)$, and
3. $R$ is an $r \times k$ matrix with $\operatorname{rank}(R) = r$.


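Example 5.2.8 1. Let $A = \begin{bmatrix} 1 & 0 & 1 & 2 \\ 0 & 1 & -1 & 1 \\ 1 & 0 & 1 & 1 \\ 0 & 1 & 1 & 1 \end{bmatrix}$. Find an orthogonal matrix $Q$ and an upper triangular matrix $R$ such that $A = QR$.

Solution: From Example 5.2.2, we know that

$$\mathbf{v}_1 = \frac{1}{\sqrt{2}}(1,0,1,0), \quad \mathbf{v}_2 = \frac{1}{\sqrt{2}}(0,1,0,1), \quad \mathbf{v}_3 = \frac{1}{\sqrt{2}}(0,-1,0,1). \qquad (5.2.6)$$

We now compute $\mathbf{w}_4$. If we denote $\mathbf{u}_4 = (2,1,1,1)^t$ then by the Gram-Schmidt process,

$$\mathbf{w}_4 = \mathbf{u}_4 - \langle\mathbf{u}_4,\mathbf{v}_1\rangle\mathbf{v}_1 - \langle\mathbf{u}_4,\mathbf{v}_2\rangle\mathbf{v}_2 - \langle\mathbf{u}_4,\mathbf{v}_3\rangle\mathbf{v}_3 = \frac{1}{2}(1,0,-1,0)^t. \qquad (5.2.7)$$

Thus, using (5.2.6) and (5.2.7), we get

$$Q = [\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4] = \begin{bmatrix} \frac{1}{\sqrt{2}} & 0 & 0 & \frac{1}{\sqrt{2}} \\ 0 & \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & 0 \\ \frac{1}{\sqrt{2}} & 0 & 0 & -\frac{1}{\sqrt{2}} \\ 0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 \end{bmatrix}$$

and

$$R = \begin{bmatrix} \sqrt{2} & 0 & \sqrt{2} & \frac{3}{\sqrt{2}} \\ 0 & \sqrt{2} & 0 & \sqrt{2} \\ 0 & 0 & \sqrt{2} & 0 \\ 0 & 0 & 0 & \frac{1}{\sqrt{2}} \end{bmatrix}.$$

The readers are advised to check that $A = QR$ is indeed correct.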


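2. Let $A = \begin{bmatrix} 1 & 1 & 1 & 0 \\ -1 & 0 & -2 & 1 \\ 1 & 1 & 1 & 0 \\ 1 & 0 & 2 & 1 \end{bmatrix}$. Find a $4 \times 3$ matrix $Q$ satisfying $Q^tQ = I_3$ and an upper triangular matrix $R$ such that $A = QR$.

Solution: Let us apply the Gram-Schmidt orthogonalisation to the columns of $A$, or equivalently to the rows of $A^t$. So, we need to apply the process to the subset $\{(1,-1,1,1), (1,0,1,0), (1,-2,1,2), (0,1,0,1)\}$ of $\mathbb{R}^4$.

Let $\mathbf{u}_1 = (1,-1,1,1)$. Define $\mathbf{v}_1 = \dfrac{\mathbf{u}_1}{2}$. Let $\mathbf{u}_2 = (1,0,1,0)$. Then

$$\mathbf{w}_2 = (1,0,1,0) - \langle\mathbf{u}_2,\mathbf{v}_1\rangle\mathbf{v}_1 = (1,0,1,0) - \mathbf{v}_1 = \frac{1}{2}(1,1,1,-1).$$

Hence, $\mathbf{v}_2 = \dfrac{(1,1,1,-1)}{2}$. Let $\mathbf{u}_3 = (1,-2,1,2)$. Then

$$\mathbf{w}_3 = \mathbf{u}_3 - \langle\mathbf{u}_3,\mathbf{v}_1\rangle\mathbf{v}_1 - \langle\mathbf{u}_3,\mathbf{v}_2\rangle\mathbf{v}_2 = \mathbf{u}_3 - 3\mathbf{v}_1 + \mathbf{v}_2 = \mathbf{0}.$$

So, we again take $\mathbf{u}_3 = (0,1,0,1)$. Then

$$\mathbf{w}_3 = \mathbf{u}_3 - \langle\mathbf{u}_3,\mathbf{v}_1\rangle\mathbf{v}_1 - \langle\mathbf{u}_3,\mathbf{v}_2\rangle\mathbf{v}_2 = \mathbf{u}_3 - 0\,\mathbf{v}_1 - 0\,\mathbf{v}_2 = \mathbf{u}_3.$$

So, $\mathbf{v}_3 = \dfrac{(0,1,0,1)}{\sqrt{2}}$. Hence,

$$Q = [\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3] = \begin{bmatrix} \frac{1}{2} & \frac{1}{2} & 0 \\ -\frac{1}{2} & \frac{1}{2} & \frac{1}{\sqrt{2}} \\ \frac{1}{2} & \frac{1}{2} & 0 \\ \frac{1}{2} & -\frac{1}{2} & \frac{1}{\sqrt{2}} \end{bmatrix}, \quad\text{and}\quad R = \begin{bmatrix} 2 & 1 & 3 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & \sqrt{2} \end{bmatrix}.$$

The readers are advised to check the following:

(a) $\operatorname{rank}(A) = 3$,
(b) $A = QR$ with $Q^tQ = I_3$, and
(c) $R$ is a $3 \times 4$ upper triangular matrix with $\operatorname{rank}(R) = 3$.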
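The checks suggested above can be carried out with a small routine that follows Theorem 5.2.7: columns whose $\mathbf{w}$ vanishes are skipped, as described in Remark 5.2.3.2, but their coefficients are still recorded in $R$. This is only a sketch under the standard inner product; the name `generalized_qr` and the tolerance are assumptions made here.

```python
import math


def generalized_qr(a, tol=1e-12):
    """Return (q, r) with q of size n x rank (orthonormal columns), r of size rank x k, a = q r."""
    n, k = len(a), len(a[0])
    cols = [[a[i][j] for i in range(n)] for j in range(k)]
    q_cols, coeffs = [], []      # coeffs[j]: coefficients of column j on the current q_cols
    for x in cols:
        c = [sum(p * q for p, q in zip(x, u)) for u in q_cols]
        w = [xi - sum(ci * u[i] for ci, u in zip(c, q_cols)) for i, xi in enumerate(x)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm > tol:           # a new direction: extend the orthonormal set
            q_cols.append([wi / norm for wi in w])
            c.append(norm)
        coeffs.append(c)         # dependent column: w = 0, nothing is appended to q_cols
    rank = len(q_cols)
    q = [[q_cols[j][i] for j in range(rank)] for i in range(n)]
    r = [[coeffs[j][i] if i < len(coeffs[j]) else 0.0 for j in range(k)] for i in range(rank)]
    return q, r


A = [[1, 1, 1, 0], [-1, 0, -2, 1], [1, 1, 1, 0], [1, 0, 2, 1]]
Q, R = generalized_qr(A)
```

For this matrix the third column is a combination of the first two, so `Q` has three columns and the third column of `R` has a zero in its last row.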


Exercise 5.2.9 1. Determine an orthonormal basis of $\mathbb{R}^4$ containing the vectors $(1,-2,1,3)$ and $(2,1,-3,1)$.

2. Prove that the polynomials $1$, $x$, $\frac{3}{2}x^2 - \frac{1}{2}$, $\frac{5}{2}x^3 - \frac{3}{2}x$ form an orthogonal set of functions in the inner product space $C[-1,1]$ with the inner product $\langle f, g\rangle = \int_{-1}^{1} f(t)g(t)\,dt$. Find the corresponding functions $f(x)$ with $\|f(x)\| = 1$.

3. Consider the vector space $C[-\pi, \pi]$ with the standard inner product defined in the above exercise. Find an orthonormal basis for the subspace spanned by $x$, $\sin x$ and $\sin(x + 1)$.

4. Let $M$ be a subspace of $\mathbb{R}^n$ and $\dim M = m$. A vector $\mathbf{x} \in \mathbb{R}^n$ is said to be orthogonal to $M$ if $\langle\mathbf{x},\mathbf{y}\rangle = 0$ for every $\mathbf{y} \in M$.

(a) How many linearly independent vectors can be orthogonal to $M$?
(b) If $M = \{(x_1, x_2, x_3) \in \mathbb{R}^3 : x_1 + x_2 + x_3 = 0\}$, determine a maximal set of linearly independent vectors orthogonal to $M$ in $\mathbb{R}^3$.

5. Determine an orthogonal basis of the vector subspace spanned by $\{(1,1,0,1), (-1,1,1,-1), (0,2,1,0), (1,0,0,0)\}$ in $\mathbb{R}^4$.

6. Let $S = \{(1,1,1,1), (1,2,0,1), (2,2,4,0)\}$. Find an orthonormal basis of $L(S)$ in $\mathbb{R}^4$.

7. Let $\mathbb{R}^n$ be endowed with the standard inner product. Suppose we have a vector $\mathbf{x}^t = (x_1, x_2, \ldots, x_n) \in \mathbb{R}^n$, with $\|\mathbf{x}\| = 1$. Then prove the following:

(a) the set $\{\mathbf{x}\}$ can always be extended to form an orthonormal basis of $\mathbb{R}^n$.
(b) Let this basis be $\{\mathbf{x}, \mathbf{x}_2, \ldots, \mathbf{x}_n\}$. Suppose $\mathcal{B} = (\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n)$ is the standard basis of $\mathbb{R}^n$. Let $A = \left[[\mathbf{x}]_{\mathcal{B}}, [\mathbf{x}_2]_{\mathcal{B}}, \ldots, [\mathbf{x}_n]_{\mathcal{B}}\right]$. Then prove that $A$ is an orthogonal matrix.


8. Let $\mathbf{v}, \mathbf{w} \in \mathbb{R}^n$, $n > 1$, with $\|\mathbf{v}\| = \|\mathbf{w}\| = 1$. Prove that there exists an orthogonal matrix $A$ such that $A\mathbf{v} = \mathbf{w}$. Prove also that $A$ can be chosen such that $\det(A) = 1$.

5.3 Orthogonal Projections and Applications

Recall that given a $k$-dimensional vector subspace $W$ of a vector space $V$ of dimension $n$, one can always find an $(n - k)$-dimensional vector subspace $W_0$ of $V$ (see Exercise 3.3.19.9) satisfying

$$W + W_0 = V \quad\text{and}\quad W \cap W_0 = \{\mathbf{0}\}.$$

The subspace $W_0$ is called a complementary subspace of $W$ in $V$. We now define an important class of linear transformations on an inner product space, called orthogonal projections.

Definition 5.3.1 (Projection Operator) Let $V$ be an $n$-dimensional vector space and let $W$ be a $k$-dimensional subspace of $V$. Let $W_0$ be a complement of $W$ in $V$. Then we define a map $P_W : V \to V$ by

$$P_W(\mathbf{v}) = \mathbf{w}, \quad\text{whenever } \mathbf{v} = \mathbf{w} + \mathbf{w}_0,\; \mathbf{w} \in W,\; \mathbf{w}_0 \in W_0.$$

The map $P_W$ is called the projection of $V$ onto $W$ along $W_0$.

Remark 5.3.2 The map $P_W$ is well defined due to the following reasons:

1. $W + W_0 = V$ implies that for every $\mathbf{v} \in V$, we can find $\mathbf{w} \in W$ and $\mathbf{w}_0 \in W_0$ such that $\mathbf{v} = \mathbf{w} + \mathbf{w}_0$.
2. $W \cap W_0 = \{\mathbf{0}\}$ implies that the expression $\mathbf{v} = \mathbf{w} + \mathbf{w}_0$ is unique for every $\mathbf{v} \in V$.

The next proposition states that the map defined above is a linear transformation from $V$ to $V$. We omit the proof, as it follows directly from the above remarks.

Proposition 5.3.3 The map $P_W : V \to V$ defined above is a linear transformation.


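Example 5.3.4 Let $V = \mathbb{R}^3$ and $W = \{(x,y,z) \in \mathbb{R}^3 : x + y - z = 0\}$.

1. Let $W_0 = L((1,2,2))$. Then $W \cap W_0 = \{\mathbf{0}\}$ and $W + W_0 = \mathbb{R}^3$. Also, for any vector $(x,y,z) \in \mathbb{R}^3$, note that $(x,y,z) = \mathbf{w} + \mathbf{w}_0$, where

$$\mathbf{w} = (z - y,\; 2z - 2x - y,\; 3z - 2x - 2y), \quad\text{and}\quad \mathbf{w}_0 = (x + y - z)(1,2,2).$$

So, by definition,

$$P_W((x,y,z)) = (z - y,\; 2z - 2x - y,\; 3z - 2x - 2y) = \begin{bmatrix} 0 & -1 & 1 \\ -2 & -1 & 2 \\ -2 & -2 & 3 \end{bmatrix}\begin{bmatrix} x \\ y \\ z \end{bmatrix}.$$

2. Let $W_0 = L((1,1,1))$. Then $W \cap W_0 = \{\mathbf{0}\}$ and $W + W_0 = \mathbb{R}^3$. Also, for any vector $(x,y,z) \in \mathbb{R}^3$, note that $(x,y,z) = \mathbf{w} + \mathbf{w}_0$, where

$$\mathbf{w} = (z - y,\; z - x,\; 2z - x - y), \quad\text{and}\quad \mathbf{w}_0 = (x + y - z)(1,1,1).$$

So, by definition,

$$P_W((x,y,z)) = (z - y,\; z - x,\; 2z - x - y) = \begin{bmatrix} 0 & -1 & 1 \\ -1 & 0 & 1 \\ -1 & -1 & 2 \end{bmatrix}\begin{bmatrix} x \\ y \\ z \end{bmatrix}.$$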
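Both matrices in the example can be reproduced from one formula. When $W$ is the set of vectors $\mathbf{v}$ with $\langle\mathbf{n},\mathbf{v}\rangle = 0$ for a fixed vector $\mathbf{n}$, and $W_0 = L(\mathbf{d})$ with $\langle\mathbf{n},\mathbf{d}\rangle \ne 0$, the decomposition $\mathbf{v} = \mathbf{w} + \mathbf{w}_0$ has $\mathbf{w}_0 = \frac{\langle\mathbf{n},\mathbf{v}\rangle}{\langle\mathbf{n},\mathbf{d}\rangle}\mathbf{d}$, so the matrix of $P_W$ is $I - \frac{\mathbf{d}\mathbf{n}^t}{\langle\mathbf{n},\mathbf{d}\rangle}$. The sketch below implements this special case with exact rational arithmetic; the function name is hypothetical, and the formula covers only subspaces $W$ described by a single linear equation.

```python
from fractions import Fraction


def projection_matrix(normal, direction):
    """Matrix of the projection onto W = {v : <normal, v> = 0} along span(direction).

    Requires <normal, direction> != 0, so that span(direction) is a complement of W.
    """
    n = len(normal)
    s = sum(a * b for a, b in zip(normal, direction))
    if s == 0:
        raise ValueError("direction lies in W, so it does not span a complement")
    return [[(1 if i == j else 0) - Fraction(direction[i] * normal[j], s)
             for j in range(n)] for i in range(n)]


# W = {(x, y, z) : x + y - z = 0}, so normal = (1, 1, -1).
P1 = projection_matrix((1, 1, -1), (1, 2, 2))   # along L((1, 2, 2))
P2 = projection_matrix((1, 1, -1), (1, 1, 1))   # along L((1, 1, 1))
```

The two results differ although $W$ is the same, which illustrates that a projection onto $W$ is determined only once the complementary subspace is fixed.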


Remark 5.3.5 1. The projection map $P_W$ depends on the complementary subspace $W_0$.

2. Observe that for a fixed subspace $W$, there are infinitely many choices for the complementary subspace $W_0$.

3. It will be shown later that if $V$ is an inner product space with inner product $\langle\,,\rangle$, then the subspace $W_0$ is unique if we put the additional condition that $W_0 = \{\mathbf{v} \in V : \langle\mathbf{v},\mathbf{w}\rangle = 0 \text{ for all } \mathbf{w} \in W\}$.

We now prove some basic properties of projection maps.

Theorem 5.3.6 Let $W$ and $W_0$ be complementary subspaces of a vector space $V$. Let $P_W : V \to V$ be the projection operator of $V$ onto $W$ along $W_0$. Then

1. the null space of $P_W$, $\mathcal{N}(P_W) = \{\mathbf{v} \in V : P_W(\mathbf{v}) = \mathbf{0}\} = W_0$.
2. the range space of $P_W$, $\mathcal{R}(P_W) = \{P_W(\mathbf{v}) : \mathbf{v} \in V\} = W$.
3. $P_W^2 = P_W$. The condition $P_W^2 = P_W$ is equivalent to $P_W(I - P_W) = \mathbf{0} = (I - P_W)P_W$.

PROOF. We only prove the first part of the theorem.

Let $\mathbf{w}_0 \in W_0$. Then $\mathbf{w}_0 = \mathbf{0} + \mathbf{w}_0$ for $\mathbf{0} \in W$. So, by definition, $P_W(\mathbf{w}_0) = \mathbf{0}$. Hence, $W_0 \subset \mathcal{N}(P_W)$.

Also, for any $\mathbf{v} \in V$, let $P_W(\mathbf{v}) = \mathbf{0}$ with $\mathbf{v} = \mathbf{w} + \mathbf{w}_0$ for some $\mathbf{w}_0 \in W_0$ and $\mathbf{w} \in W$. Then by definition $\mathbf{0} = P_W(\mathbf{v}) = \mathbf{w}$. That is, $\mathbf{w} = \mathbf{0}$ and $\mathbf{v} = \mathbf{w}_0$. Thus, $\mathbf{v} \in W_0$. Hence $\mathcal{N}(P_W) = W_0$.


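Exercise 5.3.7 1. Let $A$ be an $n \times n$ real matrix with $A^2 = A$. Consider the linear transformation $T_A : \mathbb{R}^n \to \mathbb{R}^n$, defined by $T_A(\mathbf{v}) = A\mathbf{v}$ for all $\mathbf{v} \in \mathbb{R}^n$. Prove that

(a) $T_A \circ T_A = T_A$ (use the condition $A^2 = A$).

(b) $\mathcal{N}(T_A) \cap \mathcal{R}(T_A) = \{\mathbf{0}\}$.
Hint: Let $\mathbf{x} \in \mathcal{N}(T_A) \cap \mathcal{R}(T_A)$. This implies $T_A(\mathbf{x}) = \mathbf{0}$ and $\mathbf{x} = T_A(\mathbf{y})$ for some $\mathbf{y} \in \mathbb{R}^n$. So,
$$\mathbf{x} = T_A(\mathbf{y}) = (T_A \circ T_A)(\mathbf{y}) = T_A(T_A(\mathbf{y})) = T_A(\mathbf{x}) = \mathbf{0}.$$

(c) $\mathbb{R}^n = \mathcal{N}(T_A) + \mathcal{R}(T_A)$.
Hint: Let $\{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$ be a basis of $\mathcal{N}(T_A)$. Extend it to get a basis $\{\mathbf{v}_1, \ldots, \mathbf{v}_k, \mathbf{v}_{k+1}, \ldots, \mathbf{v}_n\}$ of $\mathbb{R}^n$. Then by the Rank-nullity Theorem 4.3.6, $\{T_A(\mathbf{v}_{k+1}), \ldots, T_A(\mathbf{v}_n)\}$ is a basis of $\mathcal{R}(T_A)$.

(d) Define $W = \mathcal{R}(T_A)$ and $W_0 = \mathcal{N}(T_A)$. Then $T_A$ is a projection operator of $\mathbb{R}^n$ onto $W$ along $W_0$.

Recall that the first three parts of this exercise were also given in Exercise 4.3.10.7.

2. Find all $2 \times 2$ real matrices $A$ such that $A^2 = A$. Hence or otherwise, determine all projection operators of $\mathbb{R}^2$.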


The next result uses the Gram-Schmidt orthogonalisation process to get the complementary subspace in such a way that the vectors in different subspaces are orthogonal.

Definition 5.3.8 (Orthogonal Subspace of a Set) Let $V$ be an inner product space. Let $S$ be a non-empty subset of $V$. We define

$$S^{\perp} = \{\mathbf{v} \in V : \langle\mathbf{v},\mathbf{s}\rangle = 0 \text{ for all } \mathbf{s} \in S\}.$$

Example 5.3.9 Let $V = \mathbb{R}$.

1. $S = \{0\}$. Then $S^{\perp} = \mathbb{R}$.
2. $S = \mathbb{R}$. Then $S^{\perp} = \{0\}$.
3. Let $S$ be any subset of $\mathbb{R}$ containing a non-zero real number. Then $S^{\perp} = \{0\}$.

Theorem 5.3.10 Let $S$ be a subset of a finite dimensional inner product space $V$, with inner product $\langle\,,\rangle$. Then

1. $S^{\perp}$ is a subspace of $V$.

2. Let $S$ be equal to a subspace $W$. Then the subspaces $W$ and $W^{\perp}$ are complementary. Moreover, if $\mathbf{w} \in W$ and $\mathbf{u} \in W^{\perp}$, then $\langle\mathbf{u},\mathbf{w}\rangle = 0$ and $V = W + W^{\perp}$.

PROOF. We leave the proof of the first part for the reader. The proof of the second part is as follows:

Let $\dim(V) = n$ and $\dim(W) = k$. Let $\{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_k\}$ be a basis of $W$. By the Gram-Schmidt orthogonalisation process, we get an orthonormal basis, say $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\}$, of $W$. Then, for any $\mathbf{v} \in V$,

$$\mathbf{v} - \sum_{i=1}^{k}\langle\mathbf{v},\mathbf{v}_i\rangle\mathbf{v}_i \in W^{\perp}.$$

So, $V \subset W + W^{\perp}$. Also, for any $\mathbf{v} \in W \cap W^{\perp}$, by the definition of $W^{\perp}$, $0 = \langle\mathbf{v},\mathbf{v}\rangle = \|\mathbf{v}\|^2$. So, $\mathbf{v} = \mathbf{0}$. That is, $W \cap W^{\perp} = \{\mathbf{0}\}$.

Definition 5.3.11 (Orthogonal Complement) Let $W$ be a subspace of an inner product space $V$. The subspace $W^{\perp}$ is called the orthogonal complement of $W$ in $V$.


Exercise 5.3.12 1. Let $W = \{(x,y,z) \in \mathbb{R}^3 : x + y + z = 0\}$. Find $W^{\perp}$ with respect to the standard inner product.

2. Let $W$ be a subspace of a finite dimensional inner product space $V$. Prove that $(W^{\perp})^{\perp} = W$.

3. Let $V$ be the vector space of all $n \times n$ real matrices. Then Exercise 5.1.9.6 shows that $V$ is a real inner product space with the inner product given by $\langle A, B\rangle = \operatorname{tr}(AB^t)$. If $W$ is the subspace given by $W = \{A \in V : A^t = A\}$, determine $W^{\perp}$.


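Definition 5.3.13 (Orthogonal Projection) Let $W$ be a subspace of a finite dimensional inner product space $V$, with inner product $\langle\,,\rangle$. Let $W^{\perp}$ be the orthogonal complement of $W$ in $V$. Define $P_W : V \to V$ by

$$P_W(\mathbf{v}) = \mathbf{w} \quad\text{where } \mathbf{v} = \mathbf{w} + \mathbf{u}, \text{ with } \mathbf{w} \in W \text{ and } \mathbf{u} \in W^{\perp}.$$

Then $P_W$ is called the orthogonal projection of $V$ onto $W$ along $W^{\perp}$.

Definition 5.3.14 (Self-Adjoint Transformation/Operator) Let $V$ be an inner product space with inner product $\langle\,,\rangle$. A linear transformation $T : V \to V$ is called a self-adjoint operator if $\langle T(\mathbf{v}),\mathbf{u}\rangle = \langle\mathbf{v},T(\mathbf{u})\rangle$ for every $\mathbf{u}, \mathbf{v} \in V$.

Example 5.3.15 1. Let $A$ be an $n \times n$ real symmetric matrix. That is, $A^t = A$. Then show that the linear transformation $T_A : \mathbb{R}^n \to \mathbb{R}^n$ defined by $T_A(\mathbf{x}) = A\mathbf{x}$ for every $\mathbf{x}^t \in \mathbb{R}^n$ is a self-adjoint operator.

Solution: By definition, for every $\mathbf{x}^t, \mathbf{y}^t \in \mathbb{R}^n$,

$$\langle T_A(\mathbf{x}),\mathbf{y}\rangle = \mathbf{y}^tA\mathbf{x} = \mathbf{y}^tA^t\mathbf{x} = (A\mathbf{y})^t\mathbf{x} = \langle\mathbf{x},T_A(\mathbf{y})\rangle.$$

Hence, the result follows.

2. Let $A$ be an $n \times n$ Hermitian matrix, that is, $A^* = A$. Then the linear transformation $T_A : \mathbb{C}^n \to \mathbb{C}^n$ defined by $T_A(\mathbf{z}) = A\mathbf{z}$ for every $\mathbf{z}^t \in \mathbb{C}^n$ is a self-adjoint operator.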

Remark 5.3.16 1. By Proposition 5.3.3, the map $P_W$ defined above is a linear transformation.

2. $P_W^2 = P_W$ and $(I - P_W) P_W = 0 = P_W (I - P_W)$.

3. Let $u, v \in V$ with $u = u_1 + u_2$ and $v = v_1 + v_2$ for some $u_1, v_1 \in W$ and $u_2, v_2 \in W^\perp$. Then we know that $\langle u_i, v_j \rangle = 0$ whenever $1 \le i \ne j \le 2$. Therefore, for every $u, v \in V$,
$$\langle P_W(u), v \rangle = \langle u_1, v \rangle = \langle u_1, v_1 + v_2 \rangle = \langle u_1, v_1 \rangle = \langle u_1 + u_2, v_1 \rangle = \langle u, P_W(v) \rangle.$$
Thus, the orthogonal projection operator is a self-adjoint operator.

4. Let $v \in V$ and $w \in W$. Then $P_W(w) = w$ for all $w \in W$. Therefore, using Remarks 5.3.16.2 and 5.3.16.3, we get
$$\langle v - P_W(v), w \rangle = \langle (I - P_W)(v), P_W(w) \rangle = \langle P_W (I - P_W)(v), w \rangle = \langle 0(v), w \rangle = \langle 0, w \rangle = 0$$
for every $w \in W$.

5. In particular, $\langle v - P_W(v), P_W(v) - w \rangle = 0$ for every $w \in W$, as $P_W(v) - w \in W$. Hence, for any $v \in V$ and $w \in W$, we have
\begin{align*}
\|v - w\|^2 &= \|v - P_W(v) + P_W(v) - w\|^2 \\
&= \|v - P_W(v)\|^2 + \|P_W(v) - w\|^2 + 2 \langle v - P_W(v), P_W(v) - w \rangle \\
&= \|v - P_W(v)\|^2 + \|P_W(v) - w\|^2.
\end{align*}
Therefore,
$$\|v - w\| \ge \|v - P_W(v)\|$$
and the equality holds if and only if $w = P_W(v)$. Since $P_W(v) \in W$, we see that
$$d(v, W) = \inf \{ \|v - w\| : w \in W \} = \|v - P_W(v)\|.$$
That is, $P_W(v)$ is the vector in $W$ nearest to $v$. This can also be stated as: the vector $P_W(v)$ solves the following minimisation problem:
$$\inf_{w \in W} \|v - w\| = \|v - P_W(v)\|.$$
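The best-approximation property in item 5 can be checked numerically. The following is a minimal sketch in plain Python, assuming the standard inner product on $\mathbb{R}^3$; the helper names (`dot`, `project`, `dist`) and the sample subspace and vector are illustrative choices, not part of the notes.

```python
import math

def dot(u, v):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def project(v, onb):
    """P_W(v) = sum_i <v, v_i> v_i for an orthonormal basis onb of W."""
    out = [0.0] * len(v)
    for b in onb:
        c = dot(v, b)
        out = [o + c * bi for o, bi in zip(out, b)]
    return out

def dist(u, v):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Sample data (assumption): W is the xy-plane in R^3 with orthonormal basis (e1, e2).
onb = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
v = [3.0, -2.0, 5.0]
p = project(v, onb)

# v - P_W(v) should be orthogonal to W.
residual = [a - b for a, b in zip(v, p)]

# Among a grid of sample points w in W, the closest one to v should be P_W(v).
samples = [[s, t, 0.0] for s in range(-5, 6) for t in range(-5, 6)]
closest = min(samples, key=lambda w: dist(v, w))
```

Comparing `closest` with `p`, and `dist(v, w)` with `dist(v, p)` over the grid, illustrates the inequality $\|v - w\| \ge \|v - P_W(v)\|$ on this sample; it is a spot check, not a proof.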


5.3.1 Matrix of the Orthogonal Projection 


The minimization problem stated above arises in a lot of applications. So, it will be very helpful if the matrix of the orthogonal projection can be obtained under a given basis.

To this end, let $W$ be a $k$-dimensional subspace of $\mathbb{R}^n$ with $W^\perp$ as its orthogonal complement. Let $P_W : \mathbb{R}^n \longrightarrow \mathbb{R}^n$ be the orthogonal projection of $\mathbb{R}^n$ onto $W$. Suppose we are given an orthonormal ordered basis $B = (v_1, v_2, \ldots, v_k)$ of $W$. Under the assumption that $B$ is known, we explicitly give the matrix of $P_W$ with respect to an extended ordered basis of $\mathbb{R}^n$.


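Let us extend the given ordered orthonormal basis $B$ of $W$ to get an orthonormal ordered basis $B_1 = (v_1, v_2, \ldots, v_k, v_{k+1}, \ldots, v_n)$ of $\mathbb{R}^n$. Then by Theorem 5.1.11, for any $v \in \mathbb{R}^n$, $v = \sum_{i=1}^{n} \langle v, v_i \rangle v_i$. Thus, by definition, $P_W(v) = \sum_{i=1}^{k} \langle v, v_i \rangle v_i$. Let $A = [v_1, v_2, \ldots, v_k]$. Consider the standard orthogonal ordered basis $B_2 = (e_1, e_2, \ldots, e_n)$ of $\mathbb{R}^n$. Therefore, if $v_i = \sum_{j=1}^{n} a_{ji} e_j$ for $1 \le i \le n$ (the first $k$ of these vectors being the columns of $A$), then
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1k} \\ a_{21} & a_{22} & \cdots & a_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nk} \end{bmatrix}, \qquad [v]_{B_2} = \begin{bmatrix} \sum_{i=1}^{n} a_{1i} \langle v, v_i \rangle \\ \sum_{i=1}^{n} a_{2i} \langle v, v_i \rangle \\ \vdots \\ \sum_{i=1}^{n} a_{ni} \langle v, v_i \rangle \end{bmatrix}$$
and
$$[P_W(v)]_{B_2} = \begin{bmatrix} \sum_{i=1}^{k} a_{1i} \langle v, v_i \rangle \\ \sum_{i=1}^{k} a_{2i} \langle v, v_i \rangle \\ \vdots \\ \sum_{i=1}^{k} a_{ni} \langle v, v_i \rangle \end{bmatrix}.$$
Then, as observed in an earlier remark, $A^t A = I_k$. That is, for $1 \le i, j \le k$,
$$\sum_{s=1}^{n} a_{si} a_{sj} = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \ne j. \end{cases} \qquad (5.3.1)$$
The same relation holds for $1 \le i \le n$ and $1 \le j \le k$, since $B_1$ is orthonormal.

Thus, using the associativity of matrix product and (5.3.1), we get
\begin{align*}
(AA^t)(v) &= A \begin{bmatrix} a_{11} & a_{21} & \cdots & a_{n1} \\ a_{12} & a_{22} & \cdots & a_{n2} \\ \vdots & \vdots & \ddots & \vdots \\ a_{1k} & a_{2k} & \cdots & a_{nk} \end{bmatrix} \begin{bmatrix} \sum_{i=1}^{n} a_{1i} \langle v, v_i \rangle \\ \sum_{i=1}^{n} a_{2i} \langle v, v_i \rangle \\ \vdots \\ \sum_{i=1}^{n} a_{ni} \langle v, v_i \rangle \end{bmatrix} \\
&= A \begin{bmatrix} \sum_{s=1}^{n} a_{s1} \left( \sum_{i=1}^{n} a_{si} \langle v, v_i \rangle \right) \\ \vdots \\ \sum_{s=1}^{n} a_{sk} \left( \sum_{i=1}^{n} a_{si} \langle v, v_i \rangle \right) \end{bmatrix} = A \begin{bmatrix} \sum_{i=1}^{n} \left( \sum_{s=1}^{n} a_{s1} a_{si} \right) \langle v, v_i \rangle \\ \vdots \\ \sum_{i=1}^{n} \left( \sum_{s=1}^{n} a_{sk} a_{si} \right) \langle v, v_i \rangle \end{bmatrix} \\
&= A \begin{bmatrix} \langle v, v_1 \rangle \\ \vdots \\ \langle v, v_k \rangle \end{bmatrix} = \begin{bmatrix} \sum_{i=1}^{k} a_{1i} \langle v, v_i \rangle \\ \vdots \\ \sum_{i=1}^{k} a_{ni} \langle v, v_i \rangle \end{bmatrix} = [P_W(v)]_{B_2}.
\end{align*}

Thus $P_W[B_2, B_2] = AA^t$. Thus, we have proved the following theorem.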


Theorem 5.3.17 Let $W$ be a $k$-dimensional subspace of $\mathbb{R}^n$ and let $P_W : \mathbb{R}^n \longrightarrow \mathbb{R}^n$ be the orthogonal projection of $\mathbb{R}^n$ onto $W$ along $W^\perp$. Suppose $B = (v_1, v_2, \ldots, v_k)$ is an orthonormal ordered basis of $W$. Define an $n \times k$ matrix $A = [v_1, v_2, \ldots, v_k]$. Then the matrix of the linear transformation $P_W$ in the standard orthogonal ordered basis $(e_1, e_2, \ldots, e_n)$ is $AA^t$.

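Example 5.3.18 Let $W = \{(x, y, z, w) \in \mathbb{R}^4 : x = y, z = w\}$ be a subspace of $\mathbb{R}^4$. Then an orthonormal ordered basis of $W$ is
$$\left( \frac{1}{\sqrt{2}} (1, 1, 0, 0), \; \frac{1}{\sqrt{2}} (0, 0, 1, 1) \right),$$
and that of $W^\perp$ is
$$\left( \frac{1}{\sqrt{2}} (1, -1, 0, 0), \; \frac{1}{\sqrt{2}} (0, 0, 1, -1) \right).$$
Therefore, if $P_W : \mathbb{R}^4 \longrightarrow \mathbb{R}^4$ is the orthogonal projection of $\mathbb{R}^4$ onto $W$ along $W^\perp$, then the corresponding matrix $A$ is given by
$$A = \begin{bmatrix} \frac{1}{\sqrt{2}} & 0 \\ \frac{1}{\sqrt{2}} & 0 \\ 0 & \frac{1}{\sqrt{2}} \\ 0 & \frac{1}{\sqrt{2}} \end{bmatrix}.$$
Hence, by Theorem 5.3.17, the matrix of the orthogonal projection $P_W$ in the standard ordered basis $B = (e_1, e_2, e_3, e_4)$ is
$$P_W[B, B] = AA^t = \begin{bmatrix} \frac{1}{2} & \frac{1}{2} & 0 & 0 \\ \frac{1}{2} & \frac{1}{2} & 0 & 0 \\ 0 & 0 & \frac{1}{2} & \frac{1}{2} \\ 0 & 0 & \frac{1}{2} & \frac{1}{2} \end{bmatrix}.$$
It is easy to see that
1. the matrix $P_W[B, B]$ is symmetric,
2. $P_W[B, B]^2 = P_W[B, B]$, and
3. $(I_4 - P_W[B, B]) P_W[B, B] = 0 = P_W[B, B] (I_4 - P_W[B, B])$.

Also, for any $(x, y, z, w) \in \mathbb{R}^4$, with respect to the orthonormal ordered basis
$$B_1 = \left( \frac{1}{\sqrt{2}} (1, 1, 0, 0), \; \frac{1}{\sqrt{2}} (0, 0, 1, 1), \; \frac{1}{\sqrt{2}} (1, -1, 0, 0), \; \frac{1}{\sqrt{2}} (0, 0, 1, -1) \right)$$
of $\mathbb{R}^4$, we have
$$[(x, y, z, w)]_{B_1} = \left( \frac{x + y}{\sqrt{2}}, \; \frac{z + w}{\sqrt{2}}, \; \frac{x - y}{\sqrt{2}}, \; \frac{z - w}{\sqrt{2}} \right)^t.$$
Thus, $P_W((x, y, z, w)) = \frac{x + y}{2} (1, 1, 0, 0) + \frac{z + w}{2} (0, 0, 1, 1)$ is the vector of the subspace $W$ closest to $(x, y, z, w)$, for any vector $(x, y, z, w) \in \mathbb{R}^4$.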
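The three properties listed in Example 5.3.18 can be confirmed by direct computation. The sketch below, in plain Python with floating-point arithmetic and a small tolerance, forms $AA^t$ for the matrix $A$ of the example; the helper names (`matmul`, `transpose`, `close`) and the sample vector $(1, 2, 3, 4)$ are assumptions made for illustration.

```python
import math

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def close(X, Y, tol=1e-12):
    """Entrywise comparison of two matrices up to a small tolerance."""
    return all(abs(a - b) <= tol for rx, ry in zip(X, Y) for a, b in zip(rx, ry))

s = 1 / math.sqrt(2)
# Columns of A: the orthonormal basis (1,1,0,0)/sqrt(2), (0,0,1,1)/sqrt(2) of W.
A = [[s, 0.0], [s, 0.0], [0.0, s], [0.0, s]]
P = matmul(A, transpose(A))          # the 4 x 4 matrix A A^t

I4 = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
I_minus_P = [[I4[i][j] - P[i][j] for j in range(4)] for i in range(4)]
zero = [[0.0] * 4 for _ in range(4)]

is_symmetric = close(P, transpose(P))            # property 1
is_idempotent = close(matmul(P, P), P)           # property 2
annihilates = close(matmul(I_minus_P, P), zero)  # property 3

# Image of the sample vector (1, 2, 3, 4), written as a column.
image = matmul(P, [[1.0], [2.0], [3.0], [4.0]])
```

The entries of `image` can be compared with the closed form $\frac{x+y}{2}(1,1,0,0) + \frac{z+w}{2}(0,0,1,1)$ from the example.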


Exercise 5.3.19 1. Show that for any non-zero vector $v^t \in \mathbb{R}^n$, the rank of the matrix $vv^t$ is 1.

2. Let $W$ be a subspace of a vector space $V$ and let $P : V \longrightarrow V$ be the orthogonal projection of $V$ onto $W$ along $W^\perp$. Let $B$ be an orthonormal ordered basis of $V$. Then prove that the corresponding matrix satisfies $P[B, B]^t = P[B, B]$.

3. Let $A$ be an $n \times n$ matrix with $A^2 = A$ and $A^t = A$. Consider the associated linear transformation $T_A : \mathbb{R}^n \longrightarrow \mathbb{R}^n$ defined by $T_A(v) = Av$ for all $v^t \in \mathbb{R}^n$. Then prove that there exists a subspace $W$ of $\mathbb{R}^n$ such that $T_A$ is the orthogonal projection of $\mathbb{R}^n$ onto $W$ along $W^\perp$.

4. Let $W_1$ and $W_2$ be two distinct subspaces of a finite dimensional vector space $V$. Let $P_{W_1}$ and $P_{W_2}$ be the corresponding orthogonal projection operators of $V$ along $W_1^\perp$ and $W_2^\perp$, respectively. Then by constructing an example in $\mathbb{R}^2$, show that the map $P_{W_1} \circ P_{W_2}$ is a projection but not an orthogonal projection.

5. Let $W$ be an $(n-1)$-dimensional vector subspace of $\mathbb{R}^n$ and let $W^\perp$ be its orthogonal complement. Let $B = (v_1, v_2, \ldots, v_{n-1}, v_n)$ be an orthogonal ordered basis of $\mathbb{R}^n$ with $(v_1, v_2, \ldots, v_{n-1})$ an ordered basis of $W$. Define a map
$$T : \mathbb{R}^n \longrightarrow \mathbb{R}^n \quad \text{by} \quad T(v) = w_0 - w$$
whenever $v = w + w_0$ for some $w \in W$ and $w_0 \in W^\perp$. Then

(a) prove that $T$ is a linear transformation,
(b) find the matrix $T[B, B]$, and
(c) prove that $T[B, B]$ is an orthogonal matrix.

$T$ is called the reflection along $W^\perp$.


Chapter 6 


Eigenvalues, Eigenvectors and Diagonalization


6.1 Introduction and Definitions 


In this chapter, the linear transformations are from a given finite dimensional vector space $V$ to itself. Observe that in this case, the matrix of the linear transformation is a square matrix. So, in this chapter, all the matrices are square matrices and a vector $x$ means $x = (x_1, x_2, \ldots, x_n)^t$ for some positive integer $n$.

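Example 6.1.1 Let $A$ be a real symmetric matrix. Consider the following problem:
Maximize (Minimize) $x^t A x$ such that $x \in \mathbb{R}^n$ and $x^t x = 1$.

To solve this, consider the Lagrangian
$$L(x, \lambda) = x^t A x - \lambda (x^t x - 1) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j - \lambda \left( \sum_{i=1}^{n} x_i^2 - 1 \right).$$
Partially differentiating $L(x, \lambda)$ with respect to $x_i$ for $1 \le i \le n$, we get
\begin{align*}
\frac{\partial L}{\partial x_1} &= 2 a_{11} x_1 + 2 a_{12} x_2 + \cdots + 2 a_{1n} x_n - 2 \lambda x_1, \\
\frac{\partial L}{\partial x_2} &= 2 a_{21} x_1 + 2 a_{22} x_2 + \cdots + 2 a_{2n} x_n - 2 \lambda x_2,
\end{align*}
and so on, till
$$\frac{\partial L}{\partial x_n} = 2 a_{n1} x_1 + 2 a_{n2} x_2 + \cdots + 2 a_{nn} x_n - 2 \lambda x_n.$$
Therefore, to get the points of extrema, we solve for
$$(0, 0, \ldots, 0)^t = \left( \frac{\partial L}{\partial x_1}, \frac{\partial L}{\partial x_2}, \ldots, \frac{\partial L}{\partial x_n} \right)^t = \frac{\partial L}{\partial x} = 2 (Ax - \lambda x).$$
We therefore need to find a $\lambda \in \mathbb{R}$ and $0 \ne x \in \mathbb{R}^n$ such that $Ax = \lambda x$ for the extremal problem.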
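To see the connection between the constrained extremum and the equation $Ax = \lambda x$ on a concrete case, the sketch below samples $x^t A x$ over unit vectors in $\mathbb{R}^2$ for a sample symmetric matrix with eigenvalues 1 and 3. The matrix, the sampling resolution `N`, and the helper `quad_form` are assumptions chosen for illustration; sampling is only a numerical check, not a substitute for the Lagrangian argument.

```python
import math

A = [[2.0, 1.0], [1.0, 2.0]]   # sample symmetric matrix; its eigenvalues are 1 and 3

def quad_form(M, x):
    """The quadratic form x^t M x."""
    n = len(x)
    return sum(M[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

# Sample unit vectors x = (cos t, sin t) on the circle x^t x = 1.
N = 3600
values = [quad_form(A, [math.cos(2 * math.pi * k / N), math.sin(2 * math.pi * k / N)])
          for k in range(N)]
q_max, q_min = max(values), min(values)
```

On this sample, `q_max` and `q_min` agree with the largest and smallest eigenvalue of $A$ up to rounding.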


Example 6.1.2 Consider a system of $n$ ordinary differential equations of the form
$$\frac{dy(t)}{dt} = Ay, \quad t \ge 0; \qquad (6.1.1)$$
where $A$ is a real $n \times n$ matrix and $y$ is a column vector.
To get a solution, let us assume that
$$y(t) = c e^{\lambda t} \qquad (6.1.2)$$
is a solution of (6.1.1) and look into what $\lambda$ and $c$ have to satisfy, i.e., we are investigating for a necessary condition on $\lambda$ and $c$ so that (6.1.2) is a solution of (6.1.1). Note here that (6.1.1) has the zero solution, namely $y(t) \equiv 0$, and so we are looking for a non-zero $c$. Differentiating (6.1.2) with respect to $t$ and substituting in (6.1.1) leads to
$$\lambda e^{\lambda t} c = A e^{\lambda t} c \quad \text{or equivalently} \quad (A - \lambda I) c = 0. \qquad (6.1.3)$$
So, (6.1.2) is a solution of the given system of differential equations if and only if $\lambda$ and $c$ satisfy (6.1.3). That is, given an $n \times n$ matrix $A$, we are thus led to find a pair $(\lambda, c)$ such that $c \ne 0$ and (6.1.3) is satisfied.
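The claim that $y(t) = c e^{\lambda t}$ solves (6.1.1) exactly when $(A - \lambda I)c = 0$ can be spot-checked numerically. The sketch below uses a sample $2 \times 2$ matrix with the eigenpair $(3, (1, 1)^t)$ and compares a central finite-difference approximation of $y'(t)$ with $Ay(t)$; the matrix, the step size `h`, the sample times, and all helper names are illustrative assumptions.

```python
import math

A = [[1.0, 2.0], [2.0, 1.0]]   # sample matrix
lam, c = 3.0, [1.0, 1.0]       # an eigenpair of A: A c = 3 c

def y(t):
    """Candidate solution y(t) = c * exp(lam * t)."""
    return [ci * math.exp(lam * t) for ci in c]

def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def derivative(f, t, h=1e-5):
    """Central finite-difference approximation of f'(t), componentwise."""
    return [(a - b) / (2 * h) for a, b in zip(f(t + h), f(t - h))]

def max_relative_residual(ts):
    """Largest relative gap between y'(t) and A y(t) over the sample times."""
    worst = 0.0
    for t in ts:
        lhs, rhs = derivative(y, t), matvec(A, y(t))
        worst = max(worst, max(abs(l - r) / abs(r) for l, r in zip(lhs, rhs)))
    return worst

residual = max_relative_residual([0.0, 0.5, 1.0, 2.0])
```

A small value of `residual` is consistent with (6.1.2) being a solution; replacing `c` by a vector that is not an eigenvector makes the residual large.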


Let $A$ be a matrix of order $n$. In general, we ask the question:
For what values of $\lambda \in \mathbb{F}$ does there exist a non-zero vector $x \in \mathbb{F}^n$ such that
$$Ax = \lambda x? \qquad (6.1.4)$$
Here, $\mathbb{F}^n$ stands for either the vector space $\mathbb{R}^n$ over $\mathbb{R}$ or $\mathbb{C}^n$ over $\mathbb{C}$. Equation (6.1.4) is equivalent to the equation
$$(A - \lambda I) x = 0.$$
By Theorem 2.6.1, this system of linear equations has a non-zero solution if
$$\mathrm{rank}(A - \lambda I) < n, \quad \text{or equivalently} \quad \det(A - \lambda I) = 0.$$
So, to solve (6.1.4), we are forced to choose those values of $\lambda \in \mathbb{F}$ for which $\det(A - \lambda I) = 0$. Observe that $\det(A - \lambda I)$ is a polynomial in $\lambda$ of degree $n$. We are therefore led to the following definition.


Definition 6.1.3 (Characteristic Polynomial) Let $A$ be a matrix of order $n$. The polynomial $\det(A - \lambda I)$ is called the characteristic polynomial of $A$ and is denoted by $p(\lambda)$. The equation $p(\lambda) = 0$ is called the characteristic equation of $A$. If $\lambda \in \mathbb{F}$ is a solution of the characteristic equation $p(\lambda) = 0$, then $\lambda$ is called a characteristic value of $A$.

Some books use the term EIGENVALUE in place of characteristic value.


Theorem 6.1.4 Let $A = [a_{ij}]$; $a_{ij} \in \mathbb{F}$, for $1 \le i, j \le n$. Suppose $\lambda = \lambda_0 \in \mathbb{F}$ is a root of the characteristic equation. Then there exists a non-zero $v \in \mathbb{F}^n$ such that $Av = \lambda_0 v$.

PROOF. Since $\lambda_0$ is a root of the characteristic equation, $\det(A - \lambda_0 I) = 0$. This shows that the matrix $A - \lambda_0 I$ is singular and therefore by Theorem 2.6.1 the linear system
$$(A - \lambda_0 I_n) x = 0$$
has a non-zero solution.


Remark 6.1.5 Observe that the linear system $Ax = \lambda x$ has a solution $x = 0$ for every $\lambda \in \mathbb{F}$. So, we consider only those $x \in \mathbb{F}^n$ that are non-zero and are solutions of the linear system $Ax = \lambda x$.

Definition 6.1.6 (Eigenvalue and Eigenvector) If the linear system $Ax = \lambda x$ has a non-zero solution $x \in \mathbb{F}^n$ for some $\lambda \in \mathbb{F}$, then

1. $\lambda \in \mathbb{F}$ is called an eigenvalue of $A$,

2. $0 \ne x \in \mathbb{F}^n$ is called an eigenvector corresponding to the eigenvalue $\lambda$ of $A$, and

3. the tuple $(\lambda, x)$ is called an eigenpair.


Remark 6.1.7 To understand the difference between a characteristic value and an eigenvalue, we give the following example.

Consider the matrix $A = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$. Then the characteristic polynomial of $A$ is
$$p(\lambda) = \lambda^2 + 1.$$
Given the matrix $A$, recall the linear transformation $T_A : \mathbb{F}^2 \longrightarrow \mathbb{F}^2$ defined by
$$T_A(x) = Ax \quad \text{for every } x \in \mathbb{F}^2.$$

1. If $\mathbb{F} = \mathbb{C}$, that is, if $A$ is considered a COMPLEX matrix, then the roots of $p(\lambda) = 0$ in $\mathbb{C}$ are $\pm i$. So, $A$ has $(i, (1, i)^t)$ and $(-i, (i, 1)^t)$ as eigenpairs.

2. If $\mathbb{F} = \mathbb{R}$, that is, if $A$ is considered a REAL matrix, then $p(\lambda) = 0$ has no solution in $\mathbb{R}$. Therefore, if $\mathbb{F} = \mathbb{R}$, then $A$ has no eigenvalue but it has $\pm i$ as characteristic values.
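The two cases of Remark 6.1.7 can be verified directly with Python's built-in complex numbers. This is a minimal sketch; the helper names `matvec` and `is_eigenpair` are hypothetical, and the comparison is exact here only because all entries are Gaussian integers.

```python
def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def is_eigenpair(M, lam, x):
    """True when x is non-zero and M x = lam x (exact for integer/Gaussian-integer data)."""
    return any(xi != 0 for xi in x) and matvec(M, x) == [lam * xi for xi in x]

A = [[0, 1], [-1, 0]]

# Over C: the characteristic polynomial lambda^2 + 1 has the roots i and -i.
pairs_over_C = [(1j, [1, 1j]), (-1j, [1j, 1])]

# Over R: lambda^2 + 0*lambda + 1 has discriminant 0^2 - 4*1*1 < 0, so it has no real root.
discriminant = 0 ** 2 - 4 * 1 * 1
```

Calling `is_eigenpair(A, lam, x)` on each entry of `pairs_over_C` checks the eigenpairs stated in item 1, while the negative `discriminant` reflects item 2.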


Remark 6.1.8 Note that if $(\lambda, x)$ is an eigenpair for an $n \times n$ matrix $A$ then for any non-zero $c \in \mathbb{F}$, $(\lambda, cx)$ is also an eigenpair for $A$. Similarly, if $x_1, x_2, \ldots, x_r$ are eigenvectors of $A$ corresponding to the eigenvalue $\lambda$, then for any non-zero $(c_1, c_2, \ldots, c_r) \in \mathbb{F}^r$, it is easily seen that if $\sum_{i=1}^{r} c_i x_i \ne 0$, then $\sum_{i=1}^{r} c_i x_i$ is also an eigenvector of $A$ corresponding to the eigenvalue $\lambda$. Hence, when we talk of eigenvectors corresponding to an eigenvalue $\lambda$, we mean LINEARLY INDEPENDENT EIGENVECTORS.

Suppose $\lambda_0 \in \mathbb{F}$ is a root of the characteristic equation $\det(A - \lambda_0 I) = 0$. Then $A - \lambda_0 I$ is singular and $\mathrm{rank}(A - \lambda_0 I) < n$. Suppose $\mathrm{rank}(A - \lambda_0 I) = r < n$. Then by Corollary 4.3.9, the linear system $(A - \lambda_0 I) x = 0$ has $n - r$ linearly independent solutions. That is, $A$ has $n - r$ linearly independent eigenvectors corresponding to the eigenvalue $\lambda_0$ whenever $\mathrm{rank}(A - \lambda_0 I) = r < n$.


Example 6.1.9 1. Let $A = \mathrm{diag}(d_1, d_2, \ldots, d_n)$ with $d_i \in \mathbb{R}$ for $1 \le i \le n$. Then $p(\lambda) = \prod_{i=1}^{n} (\lambda - d_i) = 0$ is the characteristic equation. So, the eigenpairs are
$$(d_1, (1, 0, \ldots, 0)^t), \; (d_2, (0, 1, 0, \ldots, 0)^t), \; \ldots, \; (d_n, (0, \ldots, 0, 1)^t).$$

2. Let $A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$. Then $\det(A - \lambda I_2) = (1 - \lambda)^2$. Hence, the characteristic equation has roots $1, 1$. That is, 1 is a repeated eigenvalue. Now check that the equation $(A - I_2) x = 0$ for $x = (x_1, x_2)^t$ is equivalent to the equation $x_2 = 0$. And this has the solution $x = (x_1, 0)^t$. Hence, from the above remark, $(1, 0)^t$ is a representative for the eigenvector. Therefore, HERE WE HAVE TWO EIGENVALUES 1, 1 BUT ONLY ONE EIGENVECTOR.

3. Let $A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$. Then $\det(A - \lambda I_2) = (1 - \lambda)^2$. The characteristic equation has roots $1, 1$. Here, the matrix that we have is $I_2$ and we know that $I_2 x = x$ for every $x^t \in \mathbb{R}^2$, and we can CHOOSE ANY TWO LINEARLY INDEPENDENT VECTORS $x^t, y^t$ from $\mathbb{R}^2$ to get $(1, x)$ and $(1, y)$ as the two eigenpairs.

In general, if $x_1, x_2, \ldots, x_n$ are linearly independent vectors in $\mathbb{R}^n$, then $(1, x_1), (1, x_2), \ldots, (1, x_n)$ are eigenpairs for the identity matrix $I_n$.

4. Let $A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$. Then $\det(A - \lambda I_2) = (\lambda - 3)(\lambda + 1)$. The characteristic equation has roots $3, -1$. Now check that the eigenpairs are $(3, (1, 1)^t)$ and $(-1, (1, -1)^t)$. In this case, we have TWO DISTINCT EIGENVALUES AND THE CORRESPONDING EIGENVECTORS ARE ALSO LINEARLY INDEPENDENT.

The reader is required to prove the linear independence of the two eigenvectors.

5. Let $A = \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}$. Then $\det(A - \lambda I_2) = \lambda^2 - 2\lambda + 2$. The characteristic equation has roots $1 + i, 1 - i$. Hence, over $\mathbb{R}$, the matrix $A$ has no eigenvalue. Over $\mathbb{C}$, the reader is required to show that the eigenpairs are $(1 + i, (i, 1)^t)$ and $(1 - i, (1, i)^t)$.
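For $2 \times 2$ matrices the characteristic equation is a quadratic, $\lambda^2 - \mathrm{tr}(A)\lambda + \det(A) = 0$, so the roots quoted in items 2, 4 and 5 can be recomputed with the quadratic formula. The following sketch uses Python's standard `cmath` module so that complex roots are handled uniformly; the function name `eigenvalues_2x2` is a hypothetical helper, not notation from the notes.

```python
import cmath

def eigenvalues_2x2(M):
    """Roots of det(M - lambda I) = lambda^2 - tr(M) lambda + det(M) for a 2 x 2 matrix."""
    (a, b), (c, d) = M
    tr, det = a + d, a * d - b * c
    root = cmath.sqrt(tr * tr - 4 * det)
    return ((tr + root) / 2, (tr - root) / 2)

repeated = eigenvalues_2x2([[1, 1], [0, 1]])       # item 2: repeated root
distinct = eigenvalues_2x2([[1, 2], [2, 1]])       # item 4: two distinct real roots
complex_pair = eigenvalues_2x2([[1, -1], [1, 1]])  # item 5: a complex-conjugate pair
```

The function only finds the roots; as item 2 shows, a repeated root does not by itself tell how many linearly independent eigenvectors exist.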


Exercise 6.1.10 1. Find the eigenvalues of a triangular matrix.

2. Find eigenpairs over $\mathbb{C}$, for each of the following matrices:
$$\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \; \begin{bmatrix} 1 & 1+i \\ 1-i & 1 \end{bmatrix}, \; \begin{bmatrix} i & 1+i \\ -1+i & i \end{bmatrix}, \; \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}, \; \begin{bmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{bmatrix}.$$

3. Let $A$ and $B$ be similar matrices.

(a) Then prove that $A$ and $B$ have the same set of eigenvalues.
(b) Let $(\lambda, x)$ be an eigenpair for $A$ and $(\lambda, y)$ be an eigenpair for $B$. What is the relationship between the vectors $x$ and $y$?

[Hint: Recall that if the matrices $A$ and $B$ are similar, then there exists a non-singular matrix $P$ such that $B = PAP^{-1}$.]

4. Let $A = (a_{ij})$ be an $n \times n$ matrix. Suppose that for all $i$, $1 \le i \le n$, $\sum_{j=1}^{n} a_{ij} = a$. Then prove that $a$ is an eigenvalue of $A$. What is the corresponding eigenvector?

5. Prove that the matrices $A$ and $A^t$ have the same set of eigenvalues. Construct a $2 \times 2$ matrix $A$ such that the eigenvectors of $A$ and $A^t$ are different.

6. Let $A$ be a matrix such that $A^2 = A$ ($A$ is called an idempotent matrix). Then prove that its eigenvalues are either 0 or 1 or both.

7. Let $A$ be a matrix such that $A^k = 0$ ($A$ is called a nilpotent matrix) for some positive integer $k \ge 1$. Then prove that its eigenvalues are all 0.


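Theorem 6.1.11 Let $A = [a_{ij}]$ be an $n \times n$ matrix with eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$, not necessarily distinct. Then $\det(A) = \prod_{i=1}^{n} \lambda_i$ and $\mathrm{tr}(A) = \sum_{i=1}^{n} a_{ii} = \sum_{i=1}^{n} \lambda_i$.

PROOF. Since $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the $n$ eigenvalues of $A$, by definition,
$$\det(A - \lambda I_n) = p(\lambda) = (-1)^n (\lambda - \lambda_1)(\lambda - \lambda_2) \cdots (\lambda - \lambda_n) \qquad (6.1.5)$$
is an identity in $\lambda$ as polynomials. Therefore, by substituting $\lambda = 0$ in (6.1.5), we get
$$\det(A) = (-1)^n (-1)^n \prod_{i=1}^{n} \lambda_i = \prod_{i=1}^{n} \lambda_i.$$
Also,
\begin{align*}
\det(A - \lambda I_n) &= \begin{vmatrix} a_{11} - \lambda & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} - \lambda & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} - \lambda \end{vmatrix} \qquad (6.1.6) \\
&= a_0 - \lambda a_1 + \lambda^2 a_2 + \cdots + (-1)^{n-1} \lambda^{n-1} a_{n-1} + (-1)^n \lambda^n \qquad (6.1.7)
\end{align*}
for some $a_0, a_1, \ldots, a_{n-1} \in \mathbb{F}$. Note that $a_{n-1}$, the coefficient of $(-1)^{n-1} \lambda^{n-1}$, comes from the product
$$(a_{11} - \lambda)(a_{22} - \lambda) \cdots (a_{nn} - \lambda).$$
So, $a_{n-1} = \sum_{i=1}^{n} a_{ii} = \mathrm{tr}(A)$ by definition of trace.
But, from (6.1.5) and (6.1.7), we get
$$a_0 - \lambda a_1 + \lambda^2 a_2 + \cdots + (-1)^{n-1} \lambda^{n-1} a_{n-1} + (-1)^n \lambda^n = (-1)^n (\lambda - \lambda_1)(\lambda - \lambda_2) \cdots (\lambda - \lambda_n). \qquad (6.1.8)$$
Therefore, comparing the coefficient of $(-1)^{n-1} \lambda^{n-1}$, we have
$$\mathrm{tr}(A) = a_{n-1} = (-1) \left\{ (-1) \sum_{i=1}^{n} \lambda_i \right\} = \sum_{i=1}^{n} \lambda_i.$$
Hence, we get the required result.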
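Theorem 6.1.11 can be checked on a small example with exact integer arithmetic. The sketch below assumes a sample $3 \times 2$-free, purely $3 \times 3$ symmetric matrix whose characteristic polynomial factors as $(4 - \lambda)(1 - \lambda)^2$; the matrix and the helper names `det3` and `shifted` are illustrative assumptions.

```python
def det3(M):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def shifted(M, lam):
    """The matrix M - lam * I."""
    return [[M[r][s] - (lam if r == s else 0) for s in range(3)] for r in range(3)]

A = [[2, 1, 1], [1, 2, 1], [1, 1, 2]]   # sample symmetric matrix
eigenvalues = [4, 1, 1]                  # claimed roots of det(A - lambda I), with multiplicity

# A cubic is determined by its values at four points, so agreement on eleven
# integers confirms det(A - t I) = (4 - t)(1 - t)^2 as polynomials.
matches = all(det3(shifted(A, t)) == (4 - t) * (1 - t) ** 2 for t in range(-3, 8))

trace = sum(A[r][r] for r in range(3))
product = 1
for lam in eigenvalues:
    product *= lam
```

With the factorisation confirmed, `product` can be compared with `det3(A)` and `trace` with the sum of the eigenvalues.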


Exercise 6.1.12 1. Let $A$ be a skew symmetric matrix of order $2n + 1$. Then prove that 0 is an eigenvalue of $A$.

2. Let $A$ be a $3 \times 3$ orthogonal matrix ($AA^t = I$). If $\det(A) = 1$, then prove that there exists a non-zero vector $v \in \mathbb{R}^3$ such that $Av = v$.


Let $A$ be an $n \times n$ matrix. Then in the proof of the above theorem, we observed that the characteristic equation $\det(A - \lambda I) = 0$ is a polynomial equation of degree $n$ in $\lambda$. Also, for some numbers $a_0, a_1, \ldots, a_{n-1} \in \mathbb{F}$, it has the form
$$\lambda^n + a_{n-1} \lambda^{n-1} + a_{n-2} \lambda^{n-2} + \cdots + a_1 \lambda + a_0 = 0.$$
Note that, in the expression $\det(A - \lambda I) = 0$, $\lambda$ is an element of $\mathbb{F}$. Thus, we can only substitute $\lambda$ by elements of $\mathbb{F}$.
It turns out that the expression
$$A^n + a_{n-1} A^{n-1} + a_{n-2} A^{n-2} + \cdots + a_1 A + a_0 I = 0$$
holds true as a matrix identity. This is a celebrated theorem called the Cayley Hamilton Theorem. We state this theorem without proof and give some implications.

Theorem 6.1.13 (Cayley Hamilton Theorem) Let $A$ be a square matrix of order $n$. Then $A$ satisfies its characteristic equation. That is,
$$A^n + a_{n-1} A^{n-1} + a_{n-2} A^{n-2} + \cdots + a_1 A + a_0 I = 0$$
holds true as a matrix identity.

Some of the implications of the Cayley Hamilton Theorem are as follows.

Remark 6.1.14 1. Let $A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$. Then its characteristic polynomial is $p(\lambda) = \lambda^2$. Also, for the function $f(x) = x$, $f(0) = 0$, and $f(A) = A \ne 0$. This shows that the condition $f(\lambda) = 0$ for each eigenvalue $\lambda$ of $A$ does not imply that $f(A) = 0$.

2. Suppose we are given a square matrix $A$ of order $n$ and we are interested in calculating $A^\ell$ where $\ell$ is large compared to $n$. Then we can use the division algorithm to find numbers $\alpha_0, \alpha_1, \ldots, \alpha_{n-1}$ and a polynomial $f(\lambda)$ such that
\begin{align*}
\lambda^\ell &= f(\lambda) \left( \lambda^n + a_{n-1} \lambda^{n-1} + a_{n-2} \lambda^{n-2} + \cdots + a_1 \lambda + a_0 \right) \\
&\quad + \alpha_0 + \lambda \alpha_1 + \cdots + \lambda^{n-1} \alpha_{n-1}.
\end{align*}
Hence, by the Cayley Hamilton Theorem,
$$A^\ell = \alpha_0 I + \alpha_1 A + \cdots + \alpha_{n-1} A^{n-1}.$$
That is, we just need to compute the powers of $A$ till $n - 1$.

In the language of graph theory, it says the following:
"Let $G$ be a graph on $n$ vertices. Suppose there is no path of length $n - 1$ or less from a vertex $v$ to a vertex $u$ of $G$. Then there is no path from $v$ to $u$ of any length. That is, the graph $G$ is disconnected and $v$ and $u$ are in different components."

3. Let $A$ be a non-singular matrix of order $n$. Then note that $a_0 = (-1)^n \det(A) \ne 0$ and
$$A^{-1} = \frac{-1}{a_0} \left[ A^{n-1} + a_{n-1} A^{n-2} + \cdots + a_1 I \right].$$
This matrix identity can be used to calculate the inverse.

Note that the vector $A^{-1}$ (as an element of the vector space of all $n \times n$ matrices) is a linear combination of the vectors $I, A, \ldots, A^{n-1}$.
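The inverse formula in item 3 needs the coefficients $a_0, \ldots, a_{n-1}$. One way to obtain them together with the bracketed matrix $A^{n-1} + a_{n-1}A^{n-2} + \cdots + a_1 I$ is the Faddeev-LeVerrier recurrence, which is not part of these notes and is used here only as a convenient device. The sketch below works in exact rational arithmetic with the standard `fractions` module; the function name and the sample matrix are assumptions.

```python
from fractions import Fraction

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def char_poly_and_inverse(A):
    """Return ([a_0, ..., a_{n-1}, 1], A^{-1}), where
    lambda^n + a_{n-1} lambda^{n-1} + ... + a_0 = det(lambda I - A).
    Uses the Faddeev-LeVerrier recurrence M_{k+1} = A M_k + a_{n-k} I."""
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    coeffs = [Fraction(0)] * n + [Fraction(1)]
    M = identity(n)                                   # M_1 = I
    for k in range(1, n + 1):
        AM = matmul(A, M)
        c = -sum(AM[i][i] for i in range(n)) / k      # a_{n-k} = -tr(A M_k) / k
        coeffs[n - k] = c
        if k < n:
            M = [[AM[i][j] + (c if i == j else 0) for j in range(n)] for i in range(n)]
    if coeffs[0] == 0:
        raise ValueError("matrix is singular: a_0 = 0")
    # M now equals A^{n-1} + a_{n-1} A^{n-2} + ... + a_1 I, so A^{-1} = -M / a_0.
    inverse = [[-M[i][j] / coeffs[0] for j in range(n)] for i in range(n)]
    return coeffs, inverse

A = [[2, 1, 1], [1, 2, 1], [1, 1, 2]]   # sample non-singular matrix
coeffs, A_inv = char_poly_and_inverse(A)
```

Multiplying `A` by `A_inv` with `matmul` and comparing against `identity(3)` confirms the result on this sample; for large matrices Gaussian elimination is normally preferred, so this is meant only to illustrate the identity.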


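Exercise 6.1.15 Find the inverse of the following matrices by using the Cayley Hamilton Theorem:
$$\text{i)} \begin{bmatrix} 2 & 3 & 4 \\ 5 & 6 & 7 \\ 1 & 1 & 2 \end{bmatrix} \qquad \text{ii)} \begin{bmatrix} -1 & -1 & 1 \\ 1 & -1 & 1 \\ 0 & 1 & 1 \end{bmatrix} \qquad \text{iii)} \begin{bmatrix} 1 & -2 & -1 \\ -2 & 1 & -1 \\ 0 & -1 & 2 \end{bmatrix}$$
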
Theorem 6.1.16 If $\lambda_1, \lambda_2, \ldots, \lambda_k$ are distinct eigenvalues of a matrix $A$ with corresponding eigenvectors $x_1, x_2, \ldots, x_k$, then the set $\{x_1, x_2, \ldots, x_k\}$ is linearly independent.

PROOF. The proof is by induction on the number $m$ of eigenvalues. The result is obviously true if $m = 1$ as the corresponding eigenvector is non-zero and we know that any set containing exactly one non-zero vector is linearly independent.

Let the result be true for $m$, $1 \le m < k$. We prove the result for $m + 1$. We consider the equation
$$c_1 x_1 + c_2 x_2 + \cdots + c_{m+1} x_{m+1} = 0 \qquad (6.1.9)$$
for the unknowns $c_1, c_2, \ldots, c_{m+1}$. We have
\begin{align*}
0 = A0 &= A(c_1 x_1 + c_2 x_2 + \cdots + c_{m+1} x_{m+1}) \\
&= c_1 A x_1 + c_2 A x_2 + \cdots + c_{m+1} A x_{m+1} \\
&= c_1 \lambda_1 x_1 + c_2 \lambda_2 x_2 + \cdots + c_{m+1} \lambda_{m+1} x_{m+1}. \qquad (6.1.10)
\end{align*}
From equations (6.1.9) and (6.1.10), we get
$$c_2 (\lambda_2 - \lambda_1) x_2 + c_3 (\lambda_3 - \lambda_1) x_3 + \cdots + c_{m+1} (\lambda_{m+1} - \lambda_1) x_{m+1} = 0.$$
This is an equation in $m$ eigenvectors. So, by the induction hypothesis, we have
$$c_i (\lambda_i - \lambda_1) = 0 \quad \text{for } 2 \le i \le m + 1.$$
But the eigenvalues being distinct implies $\lambda_i - \lambda_1 \ne 0$ for $2 \le i \le m + 1$. We therefore get $c_i = 0$ for $2 \le i \le m + 1$. Also, $x_1 \ne 0$ and therefore (6.1.9) gives $c_1 = 0$.
Thus, we have the required result.


We are thus led to the following important corollary.

Corollary 6.1.17 The eigenvectors corresponding to distinct eigenvalues of an $n \times n$ matrix $A$ are linearly independent.

Exercise 6.1.18 1. For an $n \times n$ matrix $A$, prove the following.

(a) $A$ and $A^t$ have the same set of eigenvalues.
(b) If $\lambda$ is an eigenvalue of an invertible matrix $A$ then $\frac{1}{\lambda}$ is an eigenvalue of $A^{-1}$.
(c) If $\lambda$ is an eigenvalue of $A$ then $\lambda^k$ is an eigenvalue of $A^k$ for any positive integer $k$.
(d) If $A$ and $B$ are $n \times n$ matrices with $A$ nonsingular then $BA^{-1}$ and $A^{-1}B$ have the same set of eigenvalues.

In each case, what can you say about the eigenvectors?

2. Let $A$ and $B$ be $2 \times 2$ matrices for which $\det(A) = \det(B)$ and $\mathrm{tr}(A) = \mathrm{tr}(B)$.

(a) Do $A$ and $B$ have the same set of eigenvalues?
(b) Give examples to show that the matrices $A$ and $B$ need not be similar.

3. Let $(\lambda_1, u)$ be an eigenpair for a matrix $A$ and let $(\lambda_2, u)$ be an eigenpair for another matrix $B$.

(a) Then prove that $(\lambda_1 + \lambda_2, u)$ is an eigenpair for the matrix $A + B$.
(b) Give an example to show that if $\lambda_1, \lambda_2$ are respectively the eigenvalues of $A$ and $B$, then $\lambda_1 + \lambda_2$ need not be an eigenvalue of $A + B$.

4. Let $\lambda_i$, $1 \le i \le n$, be distinct non-zero eigenvalues of an $n \times n$ matrix $A$. Let $u_i$, $1 \le i \le n$, be the corresponding eigenvectors. Then show that $B = \{u_1, u_2, \ldots, u_n\}$ forms a basis of $\mathbb{F}^n(\mathbb{F})$. If $[b]_B = (c_1, c_2, \ldots, c_n)^t$ then show that $Ax = b$ has the unique solution
$$x = \frac{c_1}{\lambda_1} u_1 + \frac{c_2}{\lambda_2} u_2 + \cdots + \frac{c_n}{\lambda_n} u_n.$$


6.2 Diagonalization


Let $A$ be a square matrix of order $n$ and let $T_A : \mathbb{F}^n \longrightarrow \mathbb{F}^n$ be the corresponding linear transformation. In this section, we ask the question "does there exist a basis $B$ of $\mathbb{F}^n$ such that $T_A[B, B]$, the matrix of the linear transformation $T_A$, is in the simplest possible form."

We know that the simplest forms for a matrix are the identity matrix and the diagonal matrix. In this section, we show that for a certain class of matrices $A$, we can find a basis $B$ such that $T_A[B, B]$ is a diagonal matrix, consisting of the eigenvalues of $A$. This is equivalent to saying that $A$ is similar to a diagonal matrix. To show the above, we need the following definition.

Definition 6.2.1 (Matrix Diagonalization) A matrix $A$ is said to be diagonalizable if there exists a non-singular matrix $P$ such that $P^{-1}AP$ is a diagonal matrix.

Remark 6.2.2 Let $A$ be an $n \times n$ diagonalizable matrix with eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$. By definition, $A$ is similar to a diagonal matrix $D$. Observe that $D = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$ as similar matrices have the same set of eigenvalues and the eigenvalues of a diagonal matrix are its diagonal entries.


Example 6.2.3 Let $A = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$. Then we have the following:

1. Let $V = \mathbb{R}^2$. Then $A$ has no real eigenvalue (see Remark 6.1.7) and hence $A$ doesn't have eigenvectors that are vectors in $\mathbb{R}^2$. Hence, there does not exist any non-singular $2 \times 2$ real matrix $P$ such that $P^{-1}AP$ is a diagonal matrix.

2. In case $V = \mathbb{C}^2(\mathbb{C})$, the two complex eigenvalues of $A$ are $-i, i$ and the corresponding eigenvectors are $(i, 1)^t$ and $(-i, 1)^t$, respectively. Also, $(i, 1)^t$ and $(-i, 1)^t$ can be taken as a basis of $\mathbb{C}^2(\mathbb{C})$. Define a $2 \times 2$ complex matrix by $U = \frac{1}{\sqrt{2}} \begin{bmatrix} i & -i \\ 1 & 1 \end{bmatrix}$. Then
$$U^* A U = \begin{bmatrix} -i & 0 \\ 0 & i \end{bmatrix}.$$

Theorem 6.2.4 Let $A$ be an $n \times n$ matrix. Then $A$ is diagonalizable if and only if $A$ has $n$ linearly independent eigenvectors.


PROOF. Let $A$ be diagonalizable. Then there exist matrices $P$ and $D$ such that
$$P^{-1}AP = D = \mathrm{diag}(d_1, d_2, \ldots, d_n).$$
Or equivalently, $AP = PD$. Let $P = [u_1, u_2, \ldots, u_n]$. Then $AP = PD$ implies that
$$Au_i = d_i u_i \quad \text{for } 1 \le i \le n.$$
Since the $u_i$'s are the columns of a non-singular matrix $P$, they are non-zero and so for $1 \le i \le n$, we get the eigenpairs $(d_i, u_i)$ of $A$. Since the $u_i$'s are columns of the non-singular matrix $P$, using Corollary 4.3.9, we get that $u_1, u_2, \ldots, u_n$ are linearly independent.
Thus we have shown that if $A$ is diagonalizable then $A$ has $n$ linearly independent eigenvectors.
Conversely, suppose $A$ has $n$ linearly independent eigenvectors $u_i$, $1 \le i \le n$, with eigenvalues $\lambda_i$. Then $Au_i = \lambda_i u_i$. Let $P = [u_1, u_2, \ldots, u_n]$. Since $u_1, u_2, \ldots, u_n$ are linearly independent, by Corollary 4.3.9, $P$ is non-singular. Also,
\begin{align*}
AP &= [Au_1, Au_2, \ldots, Au_n] = [\lambda_1 u_1, \lambda_2 u_2, \ldots, \lambda_n u_n] \\
&= [u_1, u_2, \ldots, u_n] \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} = PD.
\end{align*}
Therefore the matrix $A$ is diagonalizable.
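The construction in the converse part of the proof can be traced on a small example: place the eigenvectors as columns of $P$, the eigenvalues on the diagonal of $D$, and compare $AP$ with $PD$. The sketch below uses the matrix of Example 6.1.9.4 with exact rational arithmetic; the helper names `matmul` and `inverse_2x2` are hypothetical.

```python
from fractions import Fraction

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def inverse_2x2(M):
    """Inverse of a 2 x 2 matrix via the adjoint formula, in exact arithmetic."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[1, 2], [2, 1]]
P = [[1, 1], [1, -1]]      # columns: the eigenvectors (1, 1)^t and (1, -1)^t
D = [[3, 0], [0, -1]]      # matching eigenvalues on the diagonal

AP, PD = matmul(A, P), matmul(P, D)
similar = matmul(inverse_2x2(P), AP)   # P^{-1} A P
```

The order of the columns of $P$ must match the order of the diagonal entries of $D$; swapping one without the other breaks the identity $AP = PD$.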


Corollary 6.2.5 Let $A$ be an $n \times n$ matrix. Suppose that the eigenvalues of $A$ are distinct. Then $A$ is diagonalizable.

PROOF. As $A$ is an $n \times n$ matrix, it has $n$ eigenvalues. Since all the eigenvalues of $A$ are distinct, by Corollary 6.1.17, the $n$ eigenvectors are linearly independent. Hence, by Theorem 6.2.4, $A$ is diagonalizable.


Corollary 6.2.6 Let $A$ be an $n \times n$ matrix with $\lambda_1, \lambda_2, \ldots, \lambda_k$ as its distinct eigenvalues and $p(x)$ as its characteristic polynomial. Suppose that for each $i$, $1 \le i \le k$, $(x - \lambda_i)^{m_i}$ divides $p(x)$ but $(x - \lambda_i)^{m_i + 1}$ does not divide $p(x)$ for some positive integers $m_i$. Then
$A$ is diagonalizable if and only if $\dim(\ker(A - \lambda_i I)) = m_i$ for each $i$, $1 \le i \le k$.
Or equivalently, $A$ is diagonalizable if and only if $\mathrm{rank}(A - \lambda_i I) = n - m_i$ for each $i$, $1 \le i \le k$.

PROOF. Suppose first that $A$ is diagonalizable. Then by Theorem 6.2.4, $A$ has $n$ linearly independent eigenvectors. Also, $\sum_{i=1}^{k} m_i = n$ as $\deg(p(x)) = n$. Hence, for each eigenvalue $\lambda_i$, $1 \le i \le k$, $A$ has exactly $m_i$ linearly independent eigenvectors. Thus, for each $i$, $1 \le i \le k$, the homogeneous linear system $(A - \lambda_i I) x = 0$ has exactly $m_i$ linearly independent vectors in its solution set. Therefore, $\dim(\ker(A - \lambda_i I)) \ge m_i$. Indeed, $\dim(\ker(A - \lambda_i I)) = m_i$ for $1 \le i \le k$ follows from a simple counting argument.
Now suppose that for each $i$, $1 \le i \le k$, $\dim(\ker(A - \lambda_i I)) = m_i$. Then for each $i$, $1 \le i \le k$, we can choose $m_i$ linearly independent eigenvectors. Also by Corollary 6.1.17, the eigenvectors corresponding to distinct eigenvalues are linearly independent. Hence $A$ has $n = \sum_{i=1}^{k} m_i$ linearly independent eigenvectors. Hence by Theorem 6.2.4, $A$ is diagonalizable.

Example 6.2.7 1. Let $A = \begin{bmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 0 & -1 & 1 \end{bmatrix}$. Then $\det(A - \lambda I) = (2 - \lambda)^2 (1 - \lambda)$. Hence, $A$ has eigenvalues $1, 2, 2$. It is easily seen that $(1, (1, 0, -1)^t)$ and $(2, (1, 1, -1)^t)$ are the only eigenpairs. That is, the matrix $A$ has exactly one eigenvector corresponding to the repeated eigenvalue 2. Hence, by Theorem 6.2.4, the matrix $A$ is not diagonalizable.

2. Let $A = \begin{bmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{bmatrix}$. Then $\det(A - \lambda I) = (4 - \lambda)(1 - \lambda)^2$. Hence, $A$ has eigenvalues $1, 1, 4$. It can be easily verified that $(1, -1, 0)^t$ and $(1, 0, -1)^t$ correspond to the eigenvalue 1 and $(1, 1, 1)^t$ corresponds to the eigenvalue 4. Note that the vectors in the set $\{(1, -1, 0)^t, (1, 0, -1)^t\}$ of eigenvectors corresponding to the eigenvalue 1 are not orthogonal. This set can be replaced by the orthogonal set $\{(1, 0, -1)^t, (1, -2, 1)^t\}$, which still consists of eigenvectors corresponding to the eigenvalue 1 as $(1, -2, 1) = 2(1, -1, 0) - (1, 0, -1)$. Also, the set $\{(1, 1, 1), (1, 0, -1), (1, -2, 1)\}$ forms a basis of $\mathbb{R}^3$. So, by Theorem 6.2.4, the matrix $A$ is diagonalizable. Also, if
$$U = \begin{bmatrix} \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} \\ \frac{1}{\sqrt{3}} & 0 & \frac{-2}{\sqrt{6}} \\ \frac{1}{\sqrt{3}} & \frac{-1}{\sqrt{2}} & \frac{1}{\sqrt{6}} \end{bmatrix}$$
is the corresponding unitary matrix then $U^* A U = \mathrm{diag}(4, 1, 1)$.
Observe that the matrix $A$ is a symmetric matrix. In this case, the eigenvectors are mutually orthogonal. In general, for any $n \times n$ real symmetric matrix $A$, there always exist $n$ eigenvectors and they are mutually orthogonal. This result will be proved later.
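The rank criterion of Corollary 6.2.6 gives a mechanical test for the two matrices of Example 6.2.7. The sketch below computes ranks by Gaussian elimination in exact rational arithmetic; the function names `rank` and `meets_rank_criterion` are hypothetical, and the eigenvalues with their multiplicities $m_i$ are taken from the example rather than computed.

```python
from fractions import Fraction

def rank(M):
    """Rank of a matrix by Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, rows):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return r

def meets_rank_criterion(A, eigen_data):
    """eigen_data lists (eigenvalue, multiplicity m_i) for every distinct eigenvalue.
    Checks rank(A - lambda_i I) == n - m_i for each i."""
    n = len(A)
    for lam, m in eigen_data:
        shifted = [[A[r][c] - (lam if r == c else 0) for c in range(n)] for r in range(n)]
        if rank(shifted) != n - m:
            return False
    return True

A1 = [[2, 1, 1], [1, 2, 1], [0, -1, 1]]   # item 1: eigenvalues 1, 2, 2
A2 = [[2, 1, 1], [1, 2, 1], [1, 1, 2]]   # item 2: eigenvalues 4, 1, 1
first = meets_rank_criterion(A1, [(1, 1), (2, 2)])
second = meets_rank_criterion(A2, [(4, 1), (1, 2)])
```

Here `first` reports the outcome for the matrix of item 1 and `second` for the matrix of item 2, matching the conclusions drawn in the example.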


Exercise 6.2.8 1. By finding the eigenvalues of the following matrices, justify whether or not $A = PDP^{-1}$ for some real non-singular matrix $P$ and a real diagonal matrix $D$.
$$\text{i)} \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \qquad \text{ii)} \begin{bmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{bmatrix} \qquad \text{for some } \theta \text{ with } 0 \le \theta \le 2\pi.$$

2. Let $A$ be an $n \times n$ matrix and $B$ an $m \times m$ matrix. Suppose $C = \begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix}$. Then show that $C$ is diagonalizable if and only if both $A$ and $B$ are diagonalizable.

3. Let $T : \mathbb{R}^5 \longrightarrow \mathbb{R}^5$ be a linear transformation with $\mathrm{rank}(T - I) = 3$ and
$$\mathcal{N}(T) = \{(x_1, x_2, x_3, x_4, x_5) \in \mathbb{R}^5 \mid x_1 + x_4 + x_5 = 0, \; x_2 + x_3 = 0\}.$$
Then

(a) determine the eigenvalues of $T$.
(b) find the number of linearly independent eigenvectors corresponding to each eigenvalue.
(c) is $T$ diagonalizable? Justify your answer.

4. Let $A$ be a non-zero square matrix such that $A^2 = 0$. Show that $A$ cannot be diagonalized. [Hint: Use Remark 6.2.2.]

5. Are the following matrices diagonalizable?
$$\text{i)} \begin{bmatrix} 1 & 3 & 2 & 1 \\ 0 & 2 & 3 & 1 \\ 0 & 0 & -1 & 1 \\ 0 & 0 & 0 & 4 \end{bmatrix} \qquad \text{ii)} \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix} \qquad \text{iii)} \begin{bmatrix} 1 & -3 & 3 \\ 0 & -5 & 6 \\ 0 & -3 & 4 \end{bmatrix}$$


6.3 Diagonalizable Matrices

In this section, we will look at some special classes of square matrices which are diagonalizable. We will also be dealing with matrices having complex entries and hence for a matrix $A = [a_{ij}]$, recall the following definitions.

Definition 6.3.1 (Special Matrices) 1. $A^* = (\overline{a_{ji}})$ is called the conjugate transpose of the matrix $A$.
Note that $A^* = \overline{A^t} = \overline{A}^{\,t}$.

2. A square matrix $A$ with complex entries is called
(a) a Hermitian matrix if $A^* = A$.
(b) a unitary matrix if $A A^* = A^* A = I_n$.
(c) a skew-Hermitian matrix if $A^* = -A$.
(d) a normal matrix if $A^* A = A A^*$.

3. A square matrix $A$ with real entries is called
(a) a symmetric matrix if $A^t = A$.
(b) an orthogonal matrix if $A A^t = A^t A = I_n$.
(c) a skew-symmetric matrix if $A^t = -A$.

Note that a symmetric matrix is always Hermitian, a skew-symmetric matrix is always skew-Hermitian and an orthogonal matrix is always unitary. Each of these matrices is normal. If $A$ is a unitary matrix then $A^* = A^{-1}$.
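The definitions above translate directly into predicates on matrices. The following sketch, in plain Python with built-in complex numbers and a small floating-point tolerance, tests each defining identity; the function names and the four sample matrices `H`, `K`, `U`, `N` are assumptions chosen for illustration.

```python
import math

def conj_transpose(M):
    """A* : the conjugate transpose of M."""
    return [[M[j][i].conjugate() for j in range(len(M))] for i in range(len(M[0]))]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def close(X, Y, tol=1e-12):
    return all(abs(a - b) <= tol for rx, ry in zip(X, Y) for a, b in zip(rx, ry))

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def is_hermitian(M):
    return close(conj_transpose(M), M)

def is_skew_hermitian(M):
    return close(conj_transpose(M), [[-x for x in row] for row in M])

def is_unitary(M):
    S, I = conj_transpose(M), identity(len(M))
    return close(matmul(M, S), I) and close(matmul(S, M), I)

def is_normal(M):
    S = conj_transpose(M)
    return close(matmul(S, M), matmul(M, S))

s = 1 / math.sqrt(2)
H = [[2, 1 + 1j], [1 - 1j, 3]]      # sample Hermitian matrix
K = [[1j, 1], [-1, 1j]]             # sample skew-Hermitian matrix
U = [[s, s * 1j], [s * 1j, s]]      # sample unitary matrix
N = [[1, 1], [-1, 1]]               # normal, but neither Hermitian nor unitary
```

Applying `is_normal` to all four samples illustrates the remark that Hermitian, skew-Hermitian and unitary matrices are normal, while `N` shows that the converse inclusions fail.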

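Example 6.3.2 1. Let $B = \begin{bmatrix} i & 1 \\ -1 & i \end{bmatrix}$. Then $B$ is skew-Hermitian.

2. Let $A = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & i \\ i & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}$. Then $A$ is a unitary matrix and $B$ is a normal matrix. Note that $\sqrt{2} A$ is also a normal matrix.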


Definition 6.3.3 (Unitary Equivalence) Let $A$ and $B$ be two $n \times n$ matrices. They are called unitarily equivalent if there exists a unitary matrix $U$ such that $A = U^* B U$.

Exercise 6.3.4 1. Let $A$ be any matrix. Then $A = \frac{1}{2}(A + A^*) + \frac{1}{2}(A - A^*)$ where $\frac{1}{2}(A + A^*)$ is the Hermitian part of $A$ and $\frac{1}{2}(A - A^*)$ is the skew-Hermitian part of $A$.

2. Every matrix can be uniquely expressed as $A = S + iT$ where both $S$ and $T$ are Hermitian matrices.

3. Show that $A - A^*$ is always skew-Hermitian.

4. Does there exist a unitary matrix $U$ such that $UAU^{-1} = B$ where
$$A = \begin{bmatrix} 1 & 1 & 4 \\ 0 & 2 & 2 \\ 0 & 0 & 3 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 2 & -1 & 3\sqrt{2} \\ 0 & 1 & \sqrt{2} \\ 0 & 0 & 3 \end{bmatrix}?$$


Proposition 6.3.5 Let $A$ be an $n \times n$ Hermitian matrix. Then all the eigenvalues of $A$ are real.

PROOF. Let $(\lambda, x)$ be an eigenpair. Then $Ax = \lambda x$ and $A = A^*$ imply
$$x^* A = x^* A^* = (Ax)^* = (\lambda x)^* = \overline{\lambda} x^*.$$
Hence
$$\lambda x^* x = x^* (\lambda x) = x^* (Ax) = (x^* A) x = (\overline{\lambda} x^*) x = \overline{\lambda} x^* x.$$
But $x$ is an eigenvector and hence $x \ne 0$, and so the real number $\|x\|^2 = x^* x$ is non-zero as well. Thus $\lambda = \overline{\lambda}$. That is, $\lambda$ is a real number.
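Proposition 6.3.5 can be observed on a small Hermitian matrix with genuinely complex off-diagonal entries. The sketch below solves the characteristic equation $\lambda^2 - \mathrm{tr}(H)\lambda + \det(H) = 0$ with the standard `cmath` module; the sample matrix `H` and the helper `eigenvalues_2x2` are assumptions for illustration.

```python
import cmath

def eigenvalues_2x2(M):
    """Roots of lambda^2 - tr(M) lambda + det(M) = 0 for a 2 x 2 matrix."""
    (a, b), (c, d) = M
    tr, det = a + d, a * d - b * c
    root = cmath.sqrt(tr * tr - 4 * det)
    return ((tr + root) / 2, (tr - root) / 2)

H = [[2, 1 + 1j], [1 - 1j, 3]]   # sample Hermitian matrix: H* = H

# Confirm H is Hermitian: each entry equals the conjugate of its transposed partner.
hermitian = all(H[i][j] == H[j][i].conjugate() for i in range(2) for j in range(2))

lam1, lam2 = eigenvalues_2x2(H)
```

Although the roots are computed as Python complex numbers, their imaginary parts vanish, as the proposition predicts; for a non-Hermitian matrix such as the one in Example 6.1.9.5 they do not.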


Theorem 6.3.6 Let A be an n x n Hermitian matrix. Then A is unitarily diagonalizable. That is, there 
exists a unitary matrix U such that U* AU = D; where D is a diagonal matrix with the eigenvalues of A as 
the diagonal entries. 


In other words, the eigenvectors of A form an orthonormal basis of C”. 


Proor. We will prove the result by induction on the size of the matrix. The result is clearly true if 
n = 1. Let the result be true for n = k — 1. we will prove the result in case n = k. So, let A beak xk 
matrix and let (A1,x) be an eigenpair of A with ||x|]| = 1. We now extend the linearly independent set 
{x} to form an orthonormal basis {x, uz, u3,...,ux,} (using Gram-Schmidt Orthogonalisation) of C*. 


As {x, Ug, U3,..., Ux} is an orthonormal set, 
u;x=0 forall i1=2,3,...,k. 
Therefore, observe that for all i, 2<i<k, 


(Au;)*x = (u; * A*)x = uj (A*x) = uj (Ax) = uF (Aix) = Ai (uFx) = 0. 
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Hence, we also have x*(Au;) = 0 for 2<i<k. Now, define U; = [x, ua, --- , ug] (with x, uo,..., ug as 
columns of U;). Then the matrix U; is a unitary matrix and 


U; AU, = UfAU, = U}[Ax Auy --: Auy| 
x* Atx*x +++ x* Aur 
uW w3(Aix) + U5 (Aug) 


[Aix Aug se Aug! = 


uz (Aix) -+-  uz(Aug) 


where $B$ is a $(k - 1) \times (k - 1)$ matrix. As the matrix $U_1$ is unitary, $U_1^* = U_1^{-1}$. So, $A^* = A$ gives
$(U_1^{-1}AU_1)^* = U_1^{-1}AU_1$. This condition, together with the fact that $\lambda_1$ is a real number (use Proposition 6.3.5),
implies that $B^* = B$. That is, $B$ is also a Hermitian matrix. Therefore, by the induction
hypothesis there exists a $(k - 1) \times (k - 1)$ unitary matrix $U_2$ such that
\[ U_2^{-1} B U_2 = D_2 = \mathrm{diag}(\lambda_2, \ldots, \lambda_k). \]


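Recall that the entries $\lambda_i$, for $2 \le i \le k$, are the eigenvalues of the matrix $B$. We also know that two
similar matrices have the same set of eigenvalues. Hence, the eigenvalues of $A$ are $\lambda_1, \lambda_2, \ldots, \lambda_k$. Define
$U = U_1 \begin{bmatrix} 1 & 0 \\ 0 & U_2 \end{bmatrix}$. Then $U$ is a unitary matrix and
\[ U^{-1}AU = \left( U_1 \begin{bmatrix} 1 & 0 \\ 0 & U_2 \end{bmatrix} \right)^{-1} A \left( U_1 \begin{bmatrix} 1 & 0 \\ 0 & U_2 \end{bmatrix} \right) = \begin{bmatrix} 1 & 0 \\ 0 & U_2^{-1} \end{bmatrix} \left( U_1^{-1} A U_1 \right) \begin{bmatrix} 1 & 0 \\ 0 & U_2 \end{bmatrix} \]
\[ = \begin{bmatrix} 1 & 0 \\ 0 & U_2^{-1} \end{bmatrix} \begin{bmatrix} \lambda_1 & 0 \\ 0 & B \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & U_2 \end{bmatrix} = \begin{bmatrix} \lambda_1 & 0 \\ 0 & U_2^{-1} B U_2 \end{bmatrix} = \begin{bmatrix} \lambda_1 & 0 \\ 0 & D_2 \end{bmatrix}. \]
Thus, $U^{-1}AU$ is a diagonal matrix with diagonal entries $\lambda_1, \lambda_2, \ldots, \lambda_k$, the eigenvalues of $A$. Hence,
the result follows.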


Corollary 6.3.7 Let A be an n x n real symmetric matrix. Then 
1. the eigenvalues of A are all real, 
2. the corresponding eigenvectors can be chosen to have real entries, and 
3. the eigenvectors also form an orthonormal basis of $\mathbb{R}^n$.


Proof. As $A$ is symmetric, $A$ is also a Hermitian matrix. Hence, by Proposition 6.3.5, the eigenvalues
of $A$ are all real. Let $(\lambda, x)$ be an eigenpair of $A$. Suppose $x \in \mathbb{C}^n$. Then there exist $y, z \in \mathbb{R}^n$ such
that $x = y + iz$. So,
\[ Ax = \lambda x \implies A(y + iz) = \lambda(y + iz). \]
Comparing the real and imaginary parts, we get $Ay = \lambda y$ and $Az = \lambda z$. As $x \neq 0$, at least one of $y$ and $z$
is non-zero. Thus, we can choose the eigenvectors to have real entries.
To prove the orthonormality of the eigenvectors, we proceed on the lines of the proof of Theorem 6.3.6.
Hence, the readers are advised to complete the proof.


Exercise 6.3.8 1. Let $A$ be a skew-Hermitian matrix. Then all the eigenvalues of $A$ are either zero or
purely imaginary. Also, the eigenvectors corresponding to distinct eigenvalues are mutually orthogonal.
[Hint: Carefully study the proof of Theorem 6.3.6.]


2. Let $A$ be an $n \times n$ unitary matrix. Then

(a) the rows of $A$ form an orthonormal basis of $\mathbb{C}^n$.
(b) the columns of $A$ form an orthonormal basis of $\mathbb{C}^n$.
(c) for any two vectors $x, y \in \mathbb{C}^{n \times 1}$, $\langle Ax, Ay \rangle = \langle x, y \rangle$.
(d) for any vector $x \in \mathbb{C}^{n \times 1}$, $\|Ax\| = \|x\|$.
(e) for any eigenvalue $\lambda$ of $A$, $|\lambda| = 1$.
(f) the eigenvectors $x, y$ corresponding to distinct eigenvalues $\lambda$ and $\mu$ satisfy $\langle x, y \rangle = 0$. That is, if
$(\lambda, x)$ and $(\mu, y)$ are eigenpairs, with $\lambda \neq \mu$, then $x$ and $y$ are mutually orthogonal.

3. Let $A$ be a normal matrix. Then, show that if $(\lambda, x)$ is an eigenpair for $A$ then $(\overline{\lambda}, x)$ is an eigenpair
for $A^*$.


4. Show that the matrices $A = \begin{bmatrix} 4 & 4 \\ 0 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 10 & 9 \\ -4 & -2 \end{bmatrix}$ are similar. Is it possible to find a unitary
matrix $U$ such that $A = U^* B U$?
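5. Let $A$ be a $2 \times 2$ orthogonal matrix. Then prove the following:

(a) if $\det(A) = 1$, then $A = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$ for some $\theta$, $0 \le \theta < 2\pi$.

(b) if $\det A = -1$, then there exists a basis of $\mathbb{R}^2$ in which the matrix of $A$ looks like $\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$.

6. Describe all $2 \times 2$ orthogonal matrices.

7. Let $A = \begin{bmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{bmatrix}$. Determine $A^{301}$.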


8. Let $A$ be a $3 \times 3$ orthogonal matrix. Then prove the following:

(a) if $\det(A) = 1$, then $A$ is a rotation about a fixed axis, in the sense that $A$ has an eigenpair $(1, x)$
such that the restriction of $A$ to the plane $x^{\perp}$ is a two dimensional rotation of $x^{\perp}$.

(b) if $\det A = -1$, then the action of $A$ corresponds to a reflection through a plane $P$, followed by a
rotation about the line through the origin that is perpendicular to $P$.

Remark 6.3.9 In the previous exercise, we saw that the matrices $A = \begin{bmatrix} 4 & 4 \\ 0 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 10 & 9 \\ -4 & -2 \end{bmatrix}$
are similar but not unitarily equivalent, whereas unitary equivalence implies similarity equivalence as
$U^* = U^{-1}$. But in numerical calculations, unitary transformations are preferred as compared to similarity
transformations. The main reasons are:

1. Exercise 6.3.8.2 implies that an orthonormal change of basis leaves unchanged the sum of squares
of the absolute values of the entries, which need not be true under a non-orthonormal change of
basis.
2. As $U^* = U^{-1}$ for a unitary matrix $U$, unitary equivalence is computationally simpler.
3. Also, in taking the "conjugate transpose", no loss of accuracy due to round-off errors occurs.
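
The first reason can be illustrated on the matrices $A = \begin{bmatrix} 4 & 4 \\ 0 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 10 & 9 \\ -4 & -2 \end{bmatrix}$. The following is a minimal sketch in plain Python; the helper names `frobenius_sq`, `trace` and `det2` are ours and are not part of the notes. It compares the trace and determinant (which similar matrices share) with the sum of squares of the absolute values of the entries (which unitarily equivalent matrices would also have to share).

```python
def frobenius_sq(m):
    """Sum of squared absolute values of the entries of a matrix (list of rows)."""
    return sum(abs(entry) ** 2 for row in m for entry in row)


def trace(m):
    """Sum of the diagonal entries of a square matrix."""
    return sum(m[i][i] for i in range(len(m)))


def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


A = [[4, 4], [0, 4]]
B = [[10, 9], [-4, -2]]

# Trace and determinant agree, as they must for similar matrices
# (agreement alone does not prove similarity).
print(trace(A), trace(B), det2(A), det2(B))

# A unitary equivalence would force these two numbers to be equal.
print(frobenius_sq(A), frobenius_sq(B))
```

Since the two sums of squares differ, no unitary $U$ with $A = U^* B U$ can exist, even though the similarity invariants agree.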


We next prove Schur's Lemma and use it to show that normal matrices are unitarily diagonalizable.

Lemma 6.3.10 (Schur's Lemma) Every $n \times n$ complex matrix is unitarily similar to an upper triangular
matrix.


Proof. We will prove the result by induction on the size of the matrix. The result is clearly true
if $n = 1$. Let the result be true for $n = k - 1$. We will prove the result for $n = k$. So, let $A$ be a
$k \times k$ matrix and let $(\lambda_1, x)$ be an eigenpair for $A$ with $\|x\| = 1$. Now the linearly independent set $\{x\}$ is
extended, using the Gram-Schmidt Orthogonalisation, to get an orthonormal basis $\{x, u_2, u_3, \ldots, u_k\}$.
Then $U_1 = [x \;\; u_2 \;\; \cdots \;\; u_k]$ (with $x, u_2, \ldots, u_k$ as the columns of the matrix $U_1$) is a unitary matrix and
\[ U_1^{-1} A U_1 = U_1^* A U_1 = U_1^* [Ax \;\; Au_2 \;\; \cdots \;\; Au_k] = \begin{bmatrix} x^* \\ u_2^* \\ \vdots \\ u_k^* \end{bmatrix} [\lambda_1 x \;\; Au_2 \;\; \cdots \;\; Au_k] = \begin{bmatrix} \lambda_1 & * \\ 0 & B \end{bmatrix}, \]
where $B$ is a $(k - 1) \times (k - 1)$ matrix. By the induction hypothesis there exists a $(k - 1) \times (k - 1)$ unitary
matrix $U_2$ such that $U_2^{-1} B U_2$ is an upper triangular matrix with diagonal entries $\lambda_2, \ldots, \lambda_k$, the
eigenvalues of the matrix $B$. Observe that since the eigenvalues of $B$ are $\lambda_2, \ldots, \lambda_k$, the eigenvalues of $A$ are
$\lambda_1, \lambda_2, \ldots, \lambda_k$. Define $U = U_1 \begin{bmatrix} 1 & 0 \\ 0 & U_2 \end{bmatrix}$. Then check that $U$ is a unitary matrix and $U^{-1}AU$ is an upper
triangular matrix with diagonal entries $\lambda_1, \lambda_2, \ldots, \lambda_k$, the eigenvalues of the matrix $A$. Hence, the result
follows.


Exercise 6.3.11 1. Let $A$ be an $n \times n$ real invertible matrix. Prove that there exists an orthogonal matrix
$P$ and a diagonal matrix $D$ with positive diagonal entries such that $AA^* = PDP^{-1}$.

2. Show that the matrices $A = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 2 & 1 \\ 0 & 0 & 3 \end{bmatrix}$ and $B = \begin{bmatrix} 2 & -1 & \sqrt{2} \\ 0 & 1 & 0 \\ 0 & 0 & 3 \end{bmatrix}$ are unitarily equivalent via the unitary
matrix $U = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 & 0 \\ 1 & -1 & 0 \\ 0 & 0 & \sqrt{2} \end{bmatrix}$. Hence, conclude that the upper triangular matrix obtained in
"Schur's Lemma" need not be unique.

3. Show that the normal matrices are diagonalizable.
[Hint: Show that the matrix $B$ in the proof of the above theorem is also a normal matrix and if $T$
is an upper triangular matrix with $T^*T = TT^*$ then $T$ has to be a diagonal matrix.]


Remark 6.3.12 (The Spectral Theorem for Normal Matrices) Let $A$ be an $n \times n$ normal
matrix. Then the above exercise shows that there exists an orthonormal basis $\{x_1, x_2, \ldots, x_n\}$ of
$\mathbb{C}^n(\mathbb{C})$ such that $Ax_i = \lambda_i x_i$ for $1 \le i \le n$.

4. Let $A$ be a normal matrix. Prove the following:

(a) if all the eigenvalues of $A$ are 0, then $A = 0$,
(b) if all the eigenvalues of $A$ are 1, then $A = I$.


We end this chapter with an application of the theory of diagonalization to the study of conic sections
in analytic geometry and the study of maxima and minima in analysis.


6.4 Sylvester’s Law of Inertia and Applications 


Definition 6.4.1 (Bilinear Form) Let $A$ be an $n \times n$ matrix with real entries. A bilinear form in
$x = (x_1, x_2, \ldots, x_n)^t$, $y = (y_1, y_2, \ldots, y_n)^t$ is an expression of the type
\[ Q(x, y) = x^t A y = \sum_{i,j=1}^{n} a_{ij} x_i y_j. \]
Observe that if $A = I$ (the identity matrix) then the bilinear form reduces to the standard real inner
product. Also, if we want it to be symmetric in $x$ and $y$ then it is necessary and sufficient that $a_{ij} = a_{ji}$
for all $i, j = 1, 2, \ldots, n$. Why? Hence, any symmetric bilinear form is naturally associated with a real
symmetric matrix.


Definition 6.4.2 (Sesquilinear Form) Let $A$ be an $n \times n$ matrix with complex entries. A sesquilinear form
in $x = (x_1, x_2, \ldots, x_n)^t$, $y = (y_1, y_2, \ldots, y_n)^t$ is given by
\[ H(x, y) = y^* A x = \sum_{i,j=1}^{n} a_{ij}\, \overline{y_i}\, x_j. \]
Note that if $A = I$ (the identity matrix) then the sesquilinear form reduces to the standard complex
inner product. Also, it can be easily seen that this form is 'linear' in the first component and 'conjugate
linear' in the second component. Also, if we want $H(x, y) = \overline{H(y, x)}$ then the matrix $A$ needs to be a
Hermitian matrix. Note that if $a_{ij} \in \mathbb{R}$ and $x, y \in \mathbb{R}^n$, then the sesquilinear form reduces to a bilinear
form.

The expression $Q(x, x)$ is called the quadratic form and $H(x, x)$ the Hermitian form. We generally
write $Q(x)$ and $H(x)$ in place of $Q(x, x)$ and $H(x, x)$, respectively. It can be easily shown that for any
choice of $x$, the Hermitian form $H(x)$ is a real number.


Therefore, in matrix notation, for a Hermitian matrix $A$, the Hermitian form can be rewritten as
\[ H(x) = x^* A x, \quad \text{where } x = (x_1, x_2, \ldots, x_n)^t \text{ and } A = [a_{ij}]. \]


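Example 6.4.3 Let $A = \begin{bmatrix} 1 & 2 - i \\ 2 + i & 2 \end{bmatrix}$. Then check that $A$ is a Hermitian matrix and for $x = (x_1, x_2)^t$,
the Hermitian form
\[ H(x) = x^* A x = (\overline{x_1}, \overline{x_2}) \begin{bmatrix} 1 & 2 - i \\ 2 + i & 2 \end{bmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \]
\[ = \overline{x_1} x_1 + 2 \overline{x_2} x_2 + (2 - i) \overline{x_1} x_2 + (2 + i) \overline{x_2} x_1 \]
\[ = |x_1|^2 + 2|x_2|^2 + 2\,\mathrm{Re}[(2 - i) \overline{x_1} x_2], \]
where 'Re' denotes the real part of a complex number. This shows that for every choice of $x$ the Hermitian
form is always real. Why?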
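
The claim can be spot-checked numerically. The sketch below uses Python's built-in complex numbers; the function name `hermitian_form` and the sample vectors are our own choices for illustration. It evaluates $x^* A x$ directly and compares it with the closed form $|x_1|^2 + 2|x_2|^2 + 2\,\mathrm{Re}[(2 - i)\overline{x_1} x_2]$.

```python
def hermitian_form(A, x):
    """Return x* A x for a square matrix A (list of rows) and a vector x."""
    n = len(x)
    return sum(x[i].conjugate() * A[i][j] * x[j]
               for i in range(n) for j in range(n))


A = [[1, 2 - 1j], [2 + 1j, 2]]

# Arbitrary sample vectors (x1, x2) with complex entries.
samples = [(1 + 0j, 1j), (2 - 3j, 0.5 + 1j), (-1j, 4 + 0j)]

for x1, x2 in samples:
    direct = hermitian_form(A, (x1, x2))
    closed = abs(x1) ** 2 + 2 * abs(x2) ** 2 + 2 * ((2 - 1j) * x1.conjugate() * x2).real
    # The imaginary part of `direct` should vanish up to rounding.
    print(direct, closed)
```

A finite number of samples is of course no proof; the reason behind the observation is that $\overline{x^*Ax} = x^*A^*x = x^*Ax$ when $A$ is Hermitian.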
The main idea is to express $H(x)$ as a sum of squares and hence determine the possible values that
it can take. Note that if we replace $x$ by $cx$, where $c$ is any complex number, then $H(x)$ simply gets
multiplied by $|c|^2$ and hence one needs to study only those $x$ for which $\|x\| = 1$, i.e., $x$ is a normalised
vector.

From Exercise 6.3.11.3 one knows that if $A = A^*$ ($A$ is Hermitian) then there exists a unitary matrix
$U$ such that $U^*AU = D$ ($D = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$ with $\lambda_i$'s the eigenvalues of the matrix $A$, which we
know are real). So, taking $z = U^*x$ (i.e., choosing $z_i$'s as linear combinations of $x_j$'s with coefficients
coming from the entries of the matrix $U^*$), one gets

\[ H(x) = x^*Ax = z^*U^*AUz = z^*Dz = \sum_{i=1}^{n} \lambda_i |z_i|^2 = \sum_{i=1}^{n} \lambda_i \left| \sum_{j=1}^{n} \overline{u_{ji}}\, x_j \right|^2. \qquad (6.4.1) \]
Thus, one knows the possible values that $H(x)$ can take depending on the eigenvalues of the matrix $A$
in case $A$ is a Hermitian matrix. Also, for $1 \le i \le n$, $\sum_{j=1}^{n} \overline{u_{ji}}\, x_j$ represents the principal axes of the conic
that they represent in the $n$-dimensional space.


Equation (6.4.1) gives one method of writing $H(x)$ as a sum of $n$ absolute squares of linearly independent
linear forms. One can easily show that there is more than one way of writing $H(x)$ as a sum of
squares. The question arises: what can we say about the coefficients when $H(x)$ has been written as
a sum of absolute squares?


This question is answered by ‘Sylvester’s law of inertia’ which we state as the next lemma. 


Lemma 6.4.4 Every Hermitian form $H(x) = x^*Ax$ (with $A$ a Hermitian matrix) in $n$ variables can be
written as
\[ H(x) = |y_1|^2 + |y_2|^2 + \cdots + |y_p|^2 - |y_{p+1}|^2 - \cdots - |y_r|^2 \]
where $y_1, y_2, \ldots, y_r$ are linearly independent linear forms in $x_1, x_2, \ldots, x_n$, and the integers $p$ and $r$,
$0 \le p \le r \le n$, depend only on $A$.


Proof. From Equation (6.4.1) it is easily seen that $H(x)$ has the required form. We need to show that $p$
and $r$ are uniquely determined by $A$.
Hence, let us assume on the contrary that there exist positive integers $p, q, r, s$ with $p > q$ such that
\[ H(x) = |y_1|^2 + |y_2|^2 + \cdots + |y_p|^2 - |y_{p+1}|^2 - \cdots - |y_r|^2 \]
\[ \phantom{H(x)} = |z_1|^2 + |z_2|^2 + \cdots + |z_q|^2 - |z_{q+1}|^2 - \cdots - |z_s|^2. \]
Since $y = (y_1, y_2, \ldots, y_n)^t$ and $z = (z_1, z_2, \ldots, z_n)^t$ are linear combinations of $x_1, x_2, \ldots, x_n$, we can find
a matrix $B$ such that $z = By$. Choose $y_{p+1} = y_{p+2} = \cdots = y_r = 0$. Since $p > q$, Theorem 2.6.1 gives
the existence of nonzero values of $y_1, y_2, \ldots, y_p$ such that $z_1 = z_2 = \cdots = z_q = 0$. Hence, we get
\[ |y_1|^2 + |y_2|^2 + \cdots + |y_p|^2 = -(|z_{q+1}|^2 + \cdots + |z_s|^2). \]
Now, this can hold only if $y_1 = y_2 = \cdots = y_p = 0$, which gives a contradiction. Hence $p = q$.

Similarly, the case $r > s$ can be resolved.


Note: The integer r is the rank of the matrix A and the number r — 2p is sometimes called the 
inertial degree of A. 
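We complete this chapter by understanding the graph of
\[ ax^2 + 2hxy + by^2 + 2fx + 2gy + c = 0 \]
for $a, b, c, f, g, h \in \mathbb{R}$. We first look at the following example.

Example 6.4.5 Sketch the graph of $3x^2 + 4xy + 3y^2 = 5$.

Solution: Note that
\[ 3x^2 + 4xy + 3y^2 = [x, y] \begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}. \]
The eigenpairs for $\begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix}$ are $(5, (1, 1)^t)$, $(1, (1, -1)^t)$. Thus,
\[ \begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} 5 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{bmatrix}. \]
Let
\[ \begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} \frac{x + y}{\sqrt{2}} \\ \frac{x - y}{\sqrt{2}} \end{bmatrix}. \]
Then
\[ 3x^2 + 4xy + 3y^2 = [u, v] \begin{bmatrix} 5 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} = 5u^2 + v^2. \]
Thus the given graph reduces to
\[ 5u^2 + v^2 = 5 \quad \text{or equivalently} \quad u^2 + \frac{v^2}{5} = 1. \]
Therefore, the given graph represents an ellipse with the principal axes $u = 0$ and $v = 0$. That is, the principal
axes are
\[ y + x = 0 \quad \text{and} \quad x - y = 0. \]
The eccentricity of the ellipse is $e = \frac{2}{\sqrt{5}}$, the foci are at the points $S_1 = (-\sqrt{2}, \sqrt{2})$ and $S_2 = (\sqrt{2}, -\sqrt{2})$,
and the equations of the directrices are $x - y = \pm \frac{5}{\sqrt{2}}$.

Figure 6.1: Ellipse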
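
The change of variables in Example 6.4.5 can be checked numerically. The sketch below is plain Python; the function names `quadratic_form` and `rotated_form` and the sample points are ours, chosen only for illustration. It compares $3x^2 + 4xy + 3y^2$ with $5u^2 + v^2$ for $u = (x + y)/\sqrt{2}$, $v = (x - y)/\sqrt{2}$, and recomputes the eccentricity and a focus from the semi-axes of $u^2 + v^2/5 = 1$.

```python
import math


def quadratic_form(x, y):
    """The original quadratic form 3x^2 + 4xy + 3y^2."""
    return 3 * x * x + 4 * x * y + 3 * y * y


def rotated_form(x, y):
    """The same form written in the (u, v) coordinates as 5u^2 + v^2."""
    u = (x + y) / math.sqrt(2)
    v = (x - y) / math.sqrt(2)
    return 5 * u * u + v * v


for x, y in [(1.0, 0.0), (0.3, -1.2), (-2.0, 0.7)]:
    print(quadratic_form(x, y), rotated_form(x, y))

# u^2 + v^2/5 = 1: semi-major axis sqrt(5) along v, semi-minor axis 1 along u.
semi_major, semi_minor = math.sqrt(5), 1.0
eccentricity = math.sqrt(1 - semi_minor ** 2 / semi_major ** 2)
focal_distance = semi_major * eccentricity

# A focus sits at (u, v) = (0, focal_distance); map it back to (x, y).
focus_x = (0 + focal_distance) / math.sqrt(2)
focus_y = (0 - focal_distance) / math.sqrt(2)
print(eccentricity, (focus_x, focus_y))
```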
Definition 6.4.6 (Associated Quadratic Form) Let $ax^2 + 2hxy + by^2 + 2gx + 2fy + c = 0$ be the equation
of a general conic. The quadratic expression
\[ ax^2 + 2hxy + by^2 = [x, y] \begin{bmatrix} a & h \\ h & b \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \]
is called the quadratic form associated with the given conic.


We now consider the general conic. We obtain conditions on the eigenvalues of the associated 
quadratic form to characterize the different conic sections in R? (endowed with the standard inner 
product). 


Proposition 6.4.7 Consider the general conic
\[ ax^2 + 2hxy + by^2 + 2gx + 2fy + c = 0. \]
Prove that this conic represents
1. an ellipse if $ab - h^2 > 0$,
2. a parabola if $ab - h^2 = 0$, and
3. a hyperbola if $ab - h^2 < 0$.


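Proof. Let $A = \begin{bmatrix} a & h \\ h & b \end{bmatrix}$. Then the associated quadratic form is
\[ ax^2 + 2hxy + by^2 = [x \;\; y]\, A \begin{bmatrix} x \\ y \end{bmatrix}. \]
As $A$ is a symmetric matrix, by Corollary 6.3.7, the eigenvalues $\lambda_1, \lambda_2$ of $A$ are both real, the corresponding
eigenvectors $u_1, u_2$ are orthonormal and $A$ is unitarily diagonalizable with
\[ A = [u_1 \;\; u_2] \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} \begin{bmatrix} u_1^t \\ u_2^t \end{bmatrix}. \qquad (6.4.2) \]
Let $\begin{bmatrix} u \\ v \end{bmatrix} = [u_1 \;\; u_2]^t \begin{bmatrix} x \\ y \end{bmatrix}$. Then
\[ ax^2 + 2hxy + by^2 = \lambda_1 u^2 + \lambda_2 v^2 \]
and the equation of the conic section in the $(u, v)$-plane reduces to
\[ \lambda_1 u^2 + \lambda_2 v^2 + 2g_1 u + 2f_1 v + c = 0. \]
Now, depending on the eigenvalues $\lambda_1, \lambda_2$, we consider different cases:

1. $\lambda_1 = 0 = \lambda_2$.
Substituting $\lambda_1 = \lambda_2 = 0$ in (6.4.2) gives $A = 0$. Thus, the given conic reduces to a straight line
$2g_1 u + 2f_1 v + c = 0$ in the $(u, v)$-plane.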


2. $\lambda_1 = 0$, $\lambda_2 \neq 0$.
In this case, the equation of the conic reduces to
\[ \lambda_2 (v + d_1)^2 = d_2 u + d_3 \quad \text{for some } d_1, d_2, d_3 \in \mathbb{R}. \]

(a) If $d_2 = d_3 = 0$, then in the $(u, v)$-plane, we get the pair of coincident lines $v = -d_1$.

(b) If $d_2 = 0$, $d_3 \neq 0$.
i. If $\lambda_2 \cdot d_3 > 0$, then we get a pair of parallel lines $v = -d_1 \pm \sqrt{\dfrac{d_3}{\lambda_2}}$.
ii. If $\lambda_2 \cdot d_3 < 0$, the solution set corresponding to the given conic is an empty set.

(c) If $d_2 \neq 0$. Then the given equation is of the form $Y^2 = 4aX$ for some translates $X = x + \alpha$
and $Y = y + \beta$ and thus represents a parabola.

Also, observe that $\lambda_1 = 0$ implies that $\det(A) = 0$. That is, $ab - h^2 = \det(A) = 0$.


3. $\lambda_1 > 0$ and $\lambda_2 < 0$.
Let $\lambda_2 = -\alpha_2$. Then the equation of the conic can be rewritten as
\[ \lambda_1 (u + d_1)^2 - \alpha_2 (v + d_2)^2 = d_3 \quad \text{for some } d_1, d_2, d_3 \in \mathbb{R}. \]
In this case, we have the following:

(a) suppose $d_3 = 0$. Then the equation of the conic reduces to
\[ \lambda_1 (u + d_1)^2 - \alpha_2 (v + d_2)^2 = 0. \]
The terms on the left can be written as a product of two factors as $\lambda_1, \alpha_2 > 0$. Thus, in this
case, the given equation represents a pair of intersecting straight lines in the $(u, v)$-plane.

(b) suppose $d_3 \neq 0$. As $d_3 \neq 0$, we can assume $d_3 > 0$. So, the equation of the conic reduces to
\[ \frac{\lambda_1 (u + d_1)^2}{d_3} - \frac{\alpha_2 (v + d_2)^2}{d_3} = 1. \]
This equation represents a hyperbola in the $(u, v)$-plane, with principal axes
\[ u + d_1 = 0 \quad \text{and} \quad v + d_2 = 0. \]

As $\lambda_1 \lambda_2 < 0$, we have
\[ ab - h^2 = \det(A) = \lambda_1 \lambda_2 < 0. \]


4. $\lambda_1, \lambda_2 > 0$.
In this case, the equation of the conic can be rewritten as
\[ \lambda_1 (u + d_1)^2 + \lambda_2 (v + d_2)^2 = d_3 \quad \text{for some } d_1, d_2, d_3 \in \mathbb{R}. \]
We now consider the following cases:

(a) suppose $d_3 = 0$. Then the equation is satisfied only at the single point $(u, v) = (-d_1, -d_2)$, where the
perpendicular lines $u + d_1 = 0$ and $v + d_2 = 0$ meet; the ellipse degenerates to a point in the $(u, v)$-plane.

(b) suppose $d_3 < 0$. Then there is no solution for the given equation. Hence, we do not get any
real ellipse in the $(u, v)$-plane.

(c) suppose $d_3 > 0$. In this case, the equation of the conic reduces to
\[ \frac{\lambda_1 (u + d_1)^2}{d_3} + \frac{\lambda_2 (v + d_2)^2}{d_3} = 1. \]
This equation represents an ellipse in the $(u, v)$-plane, with principal axes
\[ u + d_1 = 0 \quad \text{and} \quad v + d_2 = 0. \]

Also, the condition $\lambda_1 \lambda_2 > 0$ implies that
\[ ab - h^2 = \det(A) = \lambda_1 \lambda_2 > 0. \]
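
The link between the sign of $ab - h^2$ and the signs of the eigenvalues can be sketched in a few lines of Python. The function names `eigenvalues_sym2` and `conic_type` are ours, and the classifier deliberately ignores the degenerate sub-cases (lines, a single point, the empty set) discussed in the proof; it only reports the type suggested by the sign of $ab - h^2$.

```python
import math


def eigenvalues_sym2(a, h, b):
    """Eigenvalues of the real symmetric matrix [[a, h], [h, b]], smallest first."""
    mean = (a + b) / 2
    radius = math.hypot((a - b) / 2, h)
    return mean - radius, mean + radius


def conic_type(a, h, b):
    """Classify ax^2 + 2hxy + by^2 + ... = 0 by the sign of ab - h^2.

    Degenerate cases are not detected.
    """
    disc = a * b - h * h
    if disc > 0:
        return "ellipse"
    if disc < 0:
        return "hyperbola"
    return "parabola"


# The quadratic form of Example 6.4.5: 3x^2 + 4xy + 3y^2.
print(eigenvalues_sym2(3, 2, 3), conic_type(3, 2, 3))

# The product of the eigenvalues equals det(A) = ab - h^2.
l1, l2 = eigenvalues_sym2(1, 2, 1)
print(l1 * l2, conic_type(1, 2, 1))
```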


Remark 6.4.8 Observe that the condition
\[ \begin{bmatrix} u \\ v \end{bmatrix} = [u_1 \;\; u_2]^t \begin{bmatrix} x \\ y \end{bmatrix} \]
implies that the principal axes of the conic are functions of the eigenvectors $u_1$ and $u_2$.


Exercise 6.4.9 Sketch the graph of the following surfaces:

1. $x^2 + 2xy + y^2 - 6x - 10y = 3$.

2. $2x^2 + 6xy + 3y^2 - 12x - 6y = 5$.

3. $4x^2 - 4xy + 2y^2 + 12x - 8y = 10$.

4. $2x^2 - 6xy + 5y^2 - 10x + 4y = 7$.


As a last application, we consider the following problem that helps us in understanding the quadrics.
Let
\[ ax^2 + by^2 + cz^2 + 2dxy + 2exz + 2fyz + 2lx + 2my + 2nz + q = 0 \qquad (6.4.3) \]
be a general quadric. Then we need to follow the steps given below to write the above quadric in the
standard form and thereby get the picture of the quadric. The steps are:

1. Observe that this equation can be rewritten as
\[ x^t A x + b^t x + q = 0, \]
where
\[ A = \begin{bmatrix} a & d & e \\ d & b & f \\ e & f & c \end{bmatrix}, \quad b = \begin{bmatrix} 2l \\ 2m \\ 2n \end{bmatrix}, \quad \text{and} \quad x = \begin{bmatrix} x \\ y \\ z \end{bmatrix}. \]

2. As the matrix $A$ is a symmetric matrix, find an orthogonal matrix $P$ such that $P^t A P$ is a diagonal
matrix.

3. Replace the vector $x$ by $y = P^t x$. Then writing $y^t = (y_1, y_2, y_3)$, the equation (6.4.3) reduces to
\[ \lambda_1 y_1^2 + \lambda_2 y_2^2 + \lambda_3 y_3^2 + 2l_1 y_1 + 2l_2 y_2 + 2l_3 y_3 + q' = 0 \qquad (6.4.4) \]
where $\lambda_1, \lambda_2, \lambda_3$ are the eigenvalues of $A$.

4. Complete the squares, if necessary, to write the equation (6.4.4) in terms of the variables $z_1, z_2, z_3$
so that this equation is in the standard form.

5. Use the condition $y = P^t x$ to determine the centre and the planes of symmetry of the quadric in
terms of the original system.
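Example 6.4.10 Determine the quadric
\[ 2x^2 + 2y^2 + 2z^2 + 2xy + 2xz + 2yz + 4x + 2y + 4z + 2 = 0. \]

Solution: In this case, $A = \begin{bmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{bmatrix}$, $b = \begin{bmatrix} 4 \\ 2 \\ 4 \end{bmatrix}$ and $q = 2$. Check that for the orthonormal matrix
\[ P = \begin{bmatrix} \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} \\ \frac{1}{\sqrt{3}} & \frac{-1}{\sqrt{2}} & \frac{1}{\sqrt{6}} \\ \frac{1}{\sqrt{3}} & 0 & \frac{-2}{\sqrt{6}} \end{bmatrix}, \quad P^t A P = \begin{bmatrix} 4 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}. \]
So, the equation of the quadric reduces to
\[ 4y_1^2 + y_2^2 + y_3^2 + \frac{10}{\sqrt{3}} y_1 + \frac{2}{\sqrt{2}} y_2 - \frac{2}{\sqrt{6}} y_3 + 2 = 0. \]
Or equivalently,
\[ 4\left(y_1 + \frac{5}{4\sqrt{3}}\right)^2 + \left(y_2 + \frac{1}{\sqrt{2}}\right)^2 + \left(y_3 - \frac{1}{\sqrt{6}}\right)^2 = \frac{9}{12}. \]
So, the equation of the quadric in standard form is
\[ 4z_1^2 + z_2^2 + z_3^2 = \frac{9}{12}, \]
where the point $(x, y, z)^t = P\left(\frac{-5}{4\sqrt{3}}, \frac{-1}{\sqrt{2}}, \frac{1}{\sqrt{6}}\right)^t = \left(\frac{-3}{4}, \frac{1}{4}, \frac{-3}{4}\right)^t$ is the centre. The calculation of the planes of
symmetry is left as an exercise to the reader.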
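
The numbers in Example 6.4.10 can be verified with a short computation. The sketch below is plain Python with hand-written helpers `matmul` and `transpose` (our names, not from the notes). It forms $P^t A P$ for the quadric $2x^2 + 2y^2 + 2z^2 + 2xy + 2xz + 2yz + 4x + 2y + 4z + 2 = 0$, and checks that the gradient $2Ax + b$ vanishes at the point $(-3/4, 1/4, -3/4)$ and that the left side of the quadric equals $-9/12$ there, as the standard form $4z_1^2 + z_2^2 + z_3^2 = 9/12$ requires.

```python
import math

A = [[2, 1, 1], [1, 2, 1], [1, 1, 2]]
b = [4, 2, 4]
q = 2

s2, s3, s6 = math.sqrt(2), math.sqrt(3), math.sqrt(6)
P = [[1 / s3, 1 / s2, 1 / s6],
     [1 / s3, -1 / s2, 1 / s6],
     [1 / s3, 0.0, -2 / s6]]


def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*X)]


# Should be (numerically) the diagonal matrix diag(4, 1, 1).
D = matmul(transpose(P), matmul(A, P))
print(D)

centre = [-3 / 4, 1 / 4, -3 / 4]
A_centre = [sum(A[i][j] * centre[j] for j in range(3)) for i in range(3)]

# The gradient of x^t A x + b^t x + q is 2Ax + b; it vanishes at the centre.
gradient = [2 * A_centre[i] + b[i] for i in range(3)]

# Value of the left side of the quadric at the centre.
value = (sum(centre[i] * A_centre[i] for i in range(3))
         + sum(b[i] * centre[i] for i in range(3)) + q)
print(gradient, value)
```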
Part II


Ordinary Differential Equation 


Chapter 7 


Differential Equations 


7.1 Introduction and Preliminaries 


There are many branches of science and engineering where differential equations naturally arise. Nowadays
there are applications to many areas in medicine, economics and social sciences. In this context, the
study of differential equations assumes importance. In addition, in the elementary study of differential
equations, we also see the applications of many results from analysis and linear algebra. Without
spending more time on motivation (which will be clear as we go along), let us start with the following
notations. Suppose that $y$ is a dependent variable and $x$ is an independent variable. The derivatives of
$y$ (with respect to $x$) are denoted by
\[ y' = \frac{dy}{dx}, \quad y'' = \frac{d^2y}{dx^2}, \quad \ldots, \quad y^{(k)} = \frac{d^{(k)}y}{dx^{(k)}} \quad \text{for } k \ge 3. \]
The independent variable will be defined for an interval $I$, where $I$ is either $\mathbb{R}$ or an interval $a < x <
b \subset \mathbb{R}$. With these notations, we ask the question: what is a differential equation?
A differential equation is a relationship between the independent variable and the unknown dependent
function along with its derivatives.


Definition 7.1.1 (Ordinary Differential Equation, ODE) An equation of the form
\[ f(x, y, y', \ldots, y^{(n)}) = 0 \quad \text{for } x \in I \qquad (7.1.1) \]
is called an ORDINARY DIFFERENTIAL EQUATION, where $f$ is a known function from $I \times \mathbb{R}^{n+1}$ to $\mathbb{R}$. Also,
the unknown function $y$ is to be determined.


Remark 7.1.2 Usually, Equation (7.1.1) is written as $f(x, y, y', \ldots, y^{(n)}) = 0$, and the interval $I$ is not
mentioned in most of the examples.
Some examples of differential equations are
1. $y' = 6 \sin x + 9$;
2. $y'' + 2y^2 = 0$;
3. $\sqrt{y'} = \sqrt{x} + \cos y$;
4. $(y')^2 + y = 0$.
5. $y' + y = 0$.
6. $y'' + y = 0$.
7. $y^{(3)} = 0$.
8. $y'' + m \sin(y) = 0$.

Definition 7.1.3 (Order of a Differential Equation) The ORDER of a differential equation is the order of
the highest derivative occurring in the equation.

In the examples above, Equations 1, 3, 4 and 5 have order one, Equations 2, 6 and 8 have order two, and
Equation 7 has order three.


Definition 7.1.4 (Solution) A function $y = f(x)$ is called a SOLUTION of a differential equation on $I$ if
1. $f$ is differentiable (as many times as the order of the equation) on $I$ and

2. $y$ satisfies the differential equation for all $x \in I$.


Example 7.1.5 1. Show that $y = ce^{-2x}$ is a solution of $y' + 2y = 0$ on $\mathbb{R}$ for a constant $c \in \mathbb{R}$.
Solution: Let $x \in \mathbb{R}$. By direct differentiation we have $y' = -2ce^{-2x} = -2y$.

2. Show that for any constant $a \in \mathbb{R}$, $y = \dfrac{a}{1 - x}$ is a solution of
\[ (1 - x)y' - y = 0 \]
on $(-\infty, 1)$ or on $(1, \infty)$. Note that $y$ is not a solution on any interval containing 1.
Solution: It can be easily checked.


Remark 7.1.6 Sometimes a solution y is also called an INTEGRAL. A solution of the form y = g(x) is 
called an EXPLICIT SOLUTION. If y is given by an implicit relation h(x, y) = 0 and satisfies the differential 
equation, then y is called an IMPLICIT SOLUTION. 


Remark 7.1.7 Since the solution is obtained by integration, we may expect a constant of integration
(for each integration) to appear in a solution of a differential equation. If the order of the ODE is $n$, we
expect $n$ ($n \ge 1$) arbitrary constants.

To start with, let us try to understand the structure of a first order differential equation of the form
\[ f(x, y, y') = 0 \qquad (7.1.2) \]


and move to higher orders later. With this in mind let us look at: 


Definition 7.1.8 (General Solution) A function $y(x, c)$ is called a general solution of Equation (7.1.2) on
an interval $I \subset \mathbb{R}$, if $y(x, c)$ is a solution of Equation (7.1.2) for each $x \in I$, for a fixed $c \in \mathbb{R}$ but $c$ is
arbitrary.

Remark 7.1.9 The family of functions $\{y(\cdot, c) : c \in \mathbb{R}\}$ is called a one parameter family of functions
and $c$ is called a parameter. In other words, a general solution of Equation (7.1.2) is nothing but a one
parameter family of solutions of the Equation (7.1.2).

Example 7.1.10 1. Show that for each $k \in \mathbb{R}$, $y = ke^x$ is a solution of $y' = y$. This is a general solution
as it is a one parameter family of solutions. Here the parameter is $k$.

Solution: This can be easily verified.


2. Determine a differential equation for which a family of circles with center at $(1, 0)$ and arbitrary radius
$a$ is an implicit solution.

Solution: This family is represented by the implicit relation
\[ (x - 1)^2 + y^2 = a^2, \qquad (7.1.3) \]
where $a$ is a real constant. Then $y$ is a solution of the differential equation
\[ (x - 1) + y \frac{dy}{dx} = 0. \qquad (7.1.4) \]
The function $y$ satisfying Equation (7.1.3) is a one parameter family of solutions or a general solution
of Equation (7.1.4).

3. Consider the one parameter family of circles with center at $(c, 0)$ and unit radius. The family is
represented by the implicit relation
\[ (x - c)^2 + y^2 = 1, \qquad (7.1.5) \]
where $c$ is a real constant. Show that $y$ satisfies $(yy')^2 + y^2 = 1$.

Solution: We note that differentiation of the given equation leads to
\[ (x - c) + yy' = 0. \]
Now, eliminating $c$ from the two equations, we get
\[ (yy')^2 + y^2 = 1. \]


In Example 7.1.10.2 we see that $y$ is not defined explicitly as a function of $x$ but implicitly defined
by Equation (7.1.3). On the other hand, $y = \dfrac{1}{1 - x}$ is an explicit solution in Example 7.1.5.2. Solving a
differential equation means to find a solution.

Let us now look at some geometrical interpretations of the differential Equation (7.1.2). The Equation
(7.1.2) is a relation between $x$, $y$ and the slope of the function $y$ at the point $x$. For instance, let us find
the equation of the curve passing through $(0, \frac{1}{2})$ and whose slope at each point $(x, y)$ is $-\dfrac{x}{4y}$. If $y$ is the
required curve, then $y$ satisfies
\[ \frac{dy}{dx} = -\frac{x}{4y}, \quad y(0) = \frac{1}{2}. \]
It is easy to verify that $y$ satisfies the equation $x^2 + 4y^2 = 1$.


Exercise 7.1.11 1. Find the order of the following differential equations:

(a) $y^2 + \sin(y') = 1$.
(b) $y + (y')^2 = 2x$.
(c) $(y')^3 + y'' - 2y^4 = -1$.

2. Find a differential equation satisfied by the given family of curves:

(a) $y = mx$, $m$ real (family of lines).
(b) $y^2 = 4ax$, $a$ real (family of parabolas).
(c) $x = r^2 \cos\theta$, $y = r^2 \sin\theta$, $\theta$ is a parameter of the curve and $r$ is a real number (family of circles
in parametric representation).

3. Find the equation of the curve $C$ which passes through $(1, 0)$ and whose slope at each point $(x, y)$ is
$-\dfrac{x}{y}$.

7.2 Separable Equations


In general, it may not be possible to find solutions of
\[ y' = f(x, y) \]
where $f$ is an arbitrary continuous function. But there are special cases of the function $f$ for which the
above equation can be solved. One such set of equations is
\[ y' = g(y)h(x). \qquad (7.2.1) \]
Equation (7.2.1) is called a SEPARABLE EQUATION. The Equation (7.2.1) is equivalent to
\[ \frac{1}{g(y)} \frac{dy}{dx} = h(x). \]
Integrating with respect to $x$, we get
\[ H(x) = \int h(x)\,dx = \int \frac{1}{g(y)} \frac{dy}{dx}\,dx = \int \frac{dy}{g(y)} = G(y) + c, \]
where $c$ is a constant. Hence, its implicit solution is
\[ G(y) + c = H(x). \]

Example 7.2.1 1. Solve: $y' = y(y - 1)$.
Solution: Here, $g(y) = y(y - 1)$ and $h(x) = 1$. Then
\[ \int \frac{dy}{y(y - 1)} = \int dx. \]
By using partial fractions and integrating, we get
\[ y = \frac{1}{1 - e^{x + c}}, \]
where $c$ is a constant of integration.

2. Solve $y' = y^2$.
Solution: It is easy to deduce that $y = -\dfrac{1}{x + c}$, where $c$ is a constant, is the required solution.
Observe that the solution is defined only if $x + c \neq 0$ for any $x$. For example, if we let $y(0) = a$, then
$y = -\dfrac{a}{ax - 1}$ exists as long as $ax - 1 \neq 0$.
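
The closed-form solution of $y' = y^2$ with $y(0) = a$, namely $y = -a/(ax - 1)$, can be compared against a numerical integration. The sketch below uses a classical fourth-order Runge-Kutta step written in plain Python; the function names `rk4` and `exact`, the choice $a = 1$, the end point $x = 0.5$ and the step count are our assumptions for illustration. For $a = 1$ the solution blows up at $x = 1$, so the integration stops well before that point.

```python
def rk4(f, x0, y0, x_end, steps):
    """Integrate y' = f(x, y) from x0 to x_end with the classical RK4 scheme."""
    h = (x_end - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y


def exact(x, a):
    """Closed-form solution of y' = y^2, y(0) = a, valid while a*x - 1 != 0."""
    return -a / (a * x - 1)


a = 1.0
approx = rk4(lambda x, y: y * y, 0.0, a, 0.5, 1000)

# The two printed values should agree to many decimal places.
print(approx, exact(0.5, a))
```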


7.2.1 Equations Reducible to Separable Form 


There are many equations which are not of the form (7.2.1), but by a suitable substitution, they can be
reduced to the separable form. One such class of equations is
\[ y' = \frac{g_1(x, y)}{g_2(x, y)} \quad \text{or equivalently} \quad y' = g\left(\frac{y}{x}\right) \]
where $g_1$ and $g_2$ are homogeneous functions of the same degree in $x$ and $y$, and $g$ is a continuous function.
In this case, we use the substitution $y = xu(x)$ to get $y' = xu' + u$. Thus, the above equation after
substitution becomes
\[ xu' + u(x) = g(u), \]
which is a separable equation in $u$. For illustration, we consider some examples.

Example 7.2.2 1. Find the general solution of $2xyy' - y^2 + x^2 = 0$.
Solution: Let $I$ be any interval not containing 0. Then
\[ 2\frac{y}{x}y' - \left(\frac{y}{x}\right)^2 + 1 = 0. \]
Letting $y = xu(x)$, we have
\[ 2u(u'x + u) - u^2 + 1 = 0 \quad \text{or} \quad 2xuu' + u^2 + 1 = 0 \quad \text{or equivalently} \quad \frac{2u}{1 + u^2}\frac{du}{dx} = -\frac{1}{x}. \]
On integration, we get
\[ 1 + u^2 = \frac{c}{x} \]
or
\[ x^2 + y^2 - cx = 0. \]
This represents a family of circles with center $(\frac{c}{2}, 0)$ and radius $\frac{c}{2}$.

2. Find the equation of the curve passing through $(0, 1)$ and whose slope at each point $(x, y)$ is $-\dfrac{x}{2y}$.

Solution: If $y$ is such a curve then we have
\[ \frac{dy}{dx} = -\frac{x}{2y} \quad \text{and} \quad y(0) = 1. \]
Notice that it is a separable equation and it is easy to verify that $y$ satisfies $x^2 + 2y^2 = 2$.

3. The equations of the type
\[ \frac{dy}{dx} = \frac{a_1 x + b_1 y + c_1}{a_2 x + b_2 y + c_2} \]
can also be solved by the above method by replacing $x$ by $x + h$ and $y$ by $y + k$, where $h$ and $k$ are to
be chosen such that
\[ a_1 h + b_1 k + c_1 = 0 = a_2 h + b_2 k + c_2. \]
This condition changes the given differential equation into $\dfrac{dy}{dx} = \dfrac{a_1 x + b_1 y}{a_2 x + b_2 y}$. Thus, if $x \neq 0$ then the
equation reduces to the form $y' = g\left(\frac{y}{x}\right)$.
Exercise 7.2.3 1. Find the general solutions of the following:

(a) $\dfrac{dy}{dx} = -x(\ln x)(\ln y)$.

(b) $y^{-1} \cos^{-1} x + (e^x + 1)\dfrac{dy}{dx} = 0$.

2. Find the solution of

(a) $(2a^2 + r^2) = r^2 \cos\theta \dfrac{d\theta}{dr}$, $r(0) = a$.

(b) $xe^{x + y} = \dfrac{dy}{dx}$, $y(0) = 0$.

3. Obtain the general solutions of the following:

(a) $\left\{y - x\,\mathrm{cosec}\left(\dfrac{y}{x}\right)\right\} = x\dfrac{dy}{dx}$.

(b) $xy' = y + \sqrt{x^2 + y^2}$.

(c) $\dfrac{dy}{dx} = \dfrac{x - y + 2}{-x + y + 2}$.

4. Solve $y' = y - y^2$ and use it to determine $\lim_{x \to \infty} y$. This equation occurs in a model of population.

7.3 Exact Equations


As remarked, there are no general methods to find a solution of Equation (7.1.2). The EXACT EQUATIONS
form yet another class of equations that can be easily solved. In this section, we introduce this concept.
Let $D$ be a region in the $xy$-plane and let $M$ and $N$ be real valued functions defined on $D$. Consider an
equation
\[ M(x, y) + N(x, y)\frac{dy}{dx} = 0, \quad (x, y) \in D. \qquad (7.3.1) \]
In most of the books on Differential Equations, this equation is also written as
\[ M(x, y)\,dx + N(x, y)\,dy = 0, \quad (x, y) \in D. \qquad (7.3.2) \]

Definition 7.3.1 (Exact Equation) Equation (7.3.1) is called EXACT if there exists a real valued twice continuously
differentiable function $f : \mathbb{R}^2 \to \mathbb{R}$ (or the domain is an open subset of $\mathbb{R}^2$) such that
\[ \frac{\partial f}{\partial x} = M \quad \text{and} \quad \frac{\partial f}{\partial y} = N. \]

Remark 7.3.2 If Equation (7.3.1) is exact, then
\[ \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y} \cdot \frac{dy}{dx} = \frac{df(x, y)}{dx} = 0. \]
This implies that $f(x, y) = c$ (where $c$ is a constant) is an implicit solution of Equation (7.3.1). In other
words, the left side of Equation (7.3.1) is an exact differential.

Example 7.3.3 The equation $y + x\dfrac{dy}{dx} = 0$ is an exact equation. Observe that in this example, $f(x, y) = xy$.

The proof of the next theorem is given in Appendix 14.6.2.

Theorem 7.3.4 Let $M$ and $N$ be twice continuously differentiable functions in a region $D$. The Equation
(7.3.1) is exact if and only if
\[ \frac{\partial M}{\partial y} = \frac{\partial N}{\partial x}. \qquad (7.3.4) \]

Note: If the Equation (7.3.1) or Equation (7.3.2) is exact, then there is a function $f(x, y)$ satisfying
$f(x, y) = c$ for some constant $c$, such that
\[ d(f(x, y)) = M(x, y)\,dx + N(x, y)\,dy = 0. \]


Let us consider some examples, where Theorem 7.3.4 can be used to easily find the general solution.

Example 7.3.5 1. Solve
\[ 2xe^y + (x^2 e^y + \cos y)\frac{dy}{dx} = 0. \]

Solution: With the above notations, we have
\[ M = 2xe^y, \quad N = x^2 e^y + \cos y, \quad \frac{\partial M}{\partial y} = 2xe^y \quad \text{and} \quad \frac{\partial N}{\partial x} = 2xe^y. \]
Therefore, the given equation is exact. Hence, there exists a function $G(x, y)$ such that
\[ \frac{\partial G}{\partial x} = 2xe^y \quad \text{and} \quad \frac{\partial G}{\partial y} = x^2 e^y + \cos y. \]
The first partial differentiation when integrated with respect to $x$ (assuming $y$ to be a constant) gives
\[ G(x, y) = x^2 e^y + h(y). \]
But then
\[ \frac{\partial G}{\partial y} = \frac{\partial (x^2 e^y + h(y))}{\partial y} = N \]
implies $\dfrac{dh}{dy} = \cos y$ or $h(y) = \sin y + c$ where $c$ is an arbitrary constant. Thus, the general solution of
the given equation is
\[ x^2 e^y + \sin y = c. \]
The solution in this case is in implicit form.


2. Find values of $\ell$ and $m$ such that the equation
\[ \ell y^2 + mxy\frac{dy}{dx} = 0 \]
is exact. Also, find its general solution.
Solution: In this example, we have
\[ M = \ell y^2, \quad N = mxy, \quad \frac{\partial M}{\partial y} = 2\ell y \quad \text{and} \quad \frac{\partial N}{\partial x} = my. \]
Hence for the given equation to be exact, $m = 2\ell$. With this condition on $\ell$ and $m$, the equation
reduces to
\[ \ell y^2 + 2\ell xy\frac{dy}{dx} = 0. \]
This equation is not meaningful if $\ell = 0$. Thus, the above equation reduces to
\[ \frac{d}{dx}(xy^2) = 0 \]
whose solution is
\[ xy^2 = c \]
for some arbitrary constant $c$.
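3. Solve the equation
\[ (3x^2 e^y - x^2)\,dx + (x^3 e^y + y^2)\,dy = 0. \]

Solution: Here
\[ M = 3x^2 e^y - x^2 \quad \text{and} \quad N = x^3 e^y + y^2. \]
Hence, $\dfrac{\partial M}{\partial y} = \dfrac{\partial N}{\partial x} = 3x^2 e^y$. Thus the given equation is exact. Therefore,
\[ G(x, y) = \int (3x^2 e^y - x^2)\,dx = x^3 e^y - \frac{x^3}{3} + h(y) \]
(keeping $y$ as constant). To determine $h(y)$, we partially differentiate $G(x, y)$ with respect to $y$ and
compare with $N$ to get $h(y) = \dfrac{y^3}{3}$. Hence
\[ G(x, y) = x^3 e^y - \frac{x^3}{3} + \frac{y^3}{3} = c \]
is the required implicit solution.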
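
The exactness test of Theorem 7.3.4 and the potential function found in the last example can be checked numerically with central differences. The sketch below is plain Python; the helper `partial`, the step size `h` and the sample point are our choices, and finite differences only approximate the derivatives, so agreement is expected up to a small tolerance rather than exactly.

```python
import math


def M(x, y):
    return 3 * x * x * math.exp(y) - x * x


def N(x, y):
    return x ** 3 * math.exp(y) + y * y


def G(x, y):
    """Candidate potential: x^3 e^y - x^3/3 + y^3/3."""
    return x ** 3 * math.exp(y) - x ** 3 / 3 + y ** 3 / 3


def partial(f, x, y, wrt, h=1e-6):
    """Central-difference approximation of a first partial derivative of f."""
    if wrt == "x":
        return (f(x + h, y) - f(x - h, y)) / (2 * h)
    return (f(x, y + h) - f(x, y - h)) / (2 * h)


x0, y0 = 0.7, -0.4

# Exactness condition: dM/dy should match dN/dx.
print(partial(M, x0, y0, "y"), partial(N, x0, y0, "x"))

# The potential should satisfy dG/dx = M and dG/dy = N.
print(partial(G, x0, y0, "x"), M(x0, y0))
print(partial(G, x0, y0, "y"), N(x0, y0))
```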
7.3.1 Integrating Factors
On many occasions,
\[ M(x, y) + N(x, y)\frac{dy}{dx} = 0, \quad \text{or equivalently} \quad M(x, y)\,dx + N(x, y)\,dy = 0 \]
may not be exact. But the above equation may become exact if we multiply it by a proper factor. For
example, the equation
\[ y\,dx - dy = 0 \]
is not exact. But, if we multiply it with $e^{-x}$, then the equation reduces to
\[ e^{-x} y\,dx - e^{-x}\,dy = 0, \quad \text{or equivalently} \quad d(e^{-x} y) = 0, \]
an exact equation. Such a factor (in this case, $e^{-x}$) is called an INTEGRATING FACTOR for the given
equation. Formally

Definition 7.3.6 (Integrating Factor) A function $Q(x, y)$ is called an integrating factor for the Equation
(7.3.1), if the equation
\[ Q(x, y) M(x, y)\,dx + Q(x, y) N(x, y)\,dy = 0 \]
is exact.


Example 7.3.7 1. Solve the equation $y\,dx - x\,dy = 0$, $x, y > 0$.

Solution: It can be easily verified that the given equation is not exact. Multiplying by $\frac{1}{xy}$, the equation
reduces to
$$\frac{1}{xy}\,y\,dx - \frac{1}{xy}\,x\,dy = 0, \quad\text{or equivalently}\quad d(\ln x - \ln y) = 0.$$

Thus, by definition, $\frac{1}{xy}$ is an integrating factor. Hence, a general solution of the given equation is
$G(x, y) = \frac{x}{y} = c$, for some constant $c \in \mathbb{R}$. That is,
$y = cx$, for some constant $c \in \mathbb{R}$.


2. Find a general solution of the differential equation
$$(4y^2 + 3xy)\,dx - (3xy + 2x^2)\,dy = 0.$$

Solution: It can be easily verified that the given equation is not exact.

METHOD 1: Here the terms $M = 4y^2 + 3xy$ and $N = -(3xy + 2x^2)$ are homogeneous functions of
degree 2. It may be checked that an integrating factor for the given differential equation is
$$\frac{1}{Mx + Ny} = \frac{1}{xy(x + y)}.$$

Hence, we need to solve the partial differential equations
$$\frac{\partial G(x, y)}{\partial x} = \frac{y(4y + 3x)}{xy(x + y)} = \frac{4}{x} - \frac{1}{x + y} \tag{7.3.5}$$
$$\frac{\partial G(x, y)}{\partial y} = \frac{-x(3y + 2x)}{xy(x + y)} = -\frac{2}{y} - \frac{1}{x + y}. \tag{7.3.6}$$

Integrating (keeping $y$ constant) Equation (7.3.5), we have
$$G(x, y) = 4\ln|x| - \ln|x + y| + h(y) \tag{7.3.7}$$
and integrating (keeping $x$ constant) Equation (7.3.6), we get
$$G(x, y) = -2\ln|y| - \ln|x + y| + g(x). \tag{7.3.8}$$
Comparing Equations (7.3.7) and (7.3.8), the required solution is
$$G(x, y) = 4\ln|x| - \ln|x + y| - 2\ln|y| = \ln c$$
for some real constant $c$. Or equivalently, the solution is
$$x^4 = c(x + y)y^2.$$

METHOD 2: Here the terms $M = 4y^2 + 3xy$ and $N = -(3xy + 2x^2)$ are polynomials in $x$ and $y$.
Therefore, we suppose that $x^{\alpha} y^{\beta}$ is an integrating factor for some $\alpha, \beta \in \mathbb{R}$. We try to find this $\alpha$ and
$\beta$.
Multiplying the terms $M(x, y)$ and $N(x, y)$ by $x^{\alpha} y^{\beta}$, we get
$$M(x, y) = x^{\alpha} y^{\beta}(4y^2 + 3xy) \quad\text{and}\quad N(x, y) = -x^{\alpha} y^{\beta}(3xy + 2x^2).$$

For the new equation to be exact, we need $\frac{\partial M(x, y)}{\partial y} = \frac{\partial N(x, y)}{\partial x}$. That is, the terms
$$4(2 + \beta)x^{\alpha} y^{1+\beta} + 3(1 + \beta)x^{1+\alpha} y^{\beta}$$
and
$$-3(1 + \alpha)x^{\alpha} y^{1+\beta} - 2(2 + \alpha)x^{1+\alpha} y^{\beta}$$
must be equal. Solving for $\alpha$ and $\beta$, we get $\alpha = -5$ and $\beta = 1$. That is, the expression $\frac{y}{x^5}$ is also an
integrating factor for the given differential equation. This integrating factor leads to
$$G(x, y) = -\frac{y^3}{x^4} - \frac{y^2}{x^3} + h(y)$$
and
$$G(x, y) = -\frac{y^3}{x^4} - \frac{y^2}{x^3} + g(x).$$

Thus, we need $h(y) = g(x) = c$, for some constant $c \in \mathbb{R}$. Hence, the required solution by this method
is
$$y^2(y + x) = cx^4.$$
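Both integrating factors found in this example can be checked numerically. The sketch below estimates $\partial(\mu M)/\partial y - \partial(\mu N)/\partial x$ by central differences for the original equation and for the two factors; the function names, step size and sample points are illustrative assumptions, not part of the text.

```python
def M(x, y):
    return 4 * y**2 + 3 * x * y

def N(x, y):
    return -(3 * x * y + 2 * x**2)

def mu_homogeneous(x, y):
    # Method 1: 1 / (M x + N y) = 1 / (x y (x + y))
    return 1.0 / (x * y * (x + y))

def mu_power(x, y):
    # Method 2: x^alpha y^beta with alpha = -5, beta = 1
    return y / x**5

def no_factor(x, y):
    # Multiplying by 1 leaves the original equation unchanged
    return 1.0

def exactness_defect(mu, x, y, h=1e-5):
    """Central-difference estimate of d(mu*M)/dy - d(mu*N)/dx at (x, y)."""
    d_muM_dy = (mu(x, y + h) * M(x, y + h) - mu(x, y - h) * M(x, y - h)) / (2 * h)
    d_muN_dx = (mu(x + h, y) * N(x + h, y) - mu(x - h, y) * N(x - h, y)) / (2 * h)
    return d_muM_dy - d_muN_dx

points = [(1.0, 2.0), (2.0, 0.5), (1.5, 1.5)]
for x, y in points:
    print(exactness_defect(no_factor, x, y),
          exactness_defect(mu_homogeneous, x, y),
          exactness_defect(mu_power, x, y))
```

The first column is far from zero (the equation is not exact as given), while the other two columns should be close to zero, in line with Remark 7.3.8 that an equation may have more than one integrating factor.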


Remark 7.3.8 1. If Equation (7.3.1) has a general solution, then it can be shown that Equation 
(7.3.1) admits an integrating factor. 


2. If Equation (7.3.1) has an integrating factor, then it has many (in fact infinitely many) integrating 
factors. 


3. Given Equation (7.3.1), whether or not it has an integrating factor, is a tough question to settle. 
4. In some cases, we use the following rules to find the integrating factors. 


(a) Consider a homogeneous equation $M(x, y)\,dx + N(x, y)\,dy = 0$. If
$Mx + Ny \neq 0$, then $\frac{1}{Mx + Ny}$
is an Integrating Factor.

(b) If the functions $M(x, y)$ and $N(x, y)$ are polynomial functions in $x, y$, then $x^{\alpha} y^{\beta}$ works as an
integrating factor for some appropriate values of $\alpha$ and $\beta$.

(c) The equation $M(x, y)\,dx + N(x, y)\,dy = 0$ has $e^{\int f(x)\,dx}$ as an integrating factor, if
$f(x) = \frac{1}{N}\left(\frac{\partial M}{\partial y} - \frac{\partial N}{\partial x}\right)$ is a function of $x$ alone.

(d) The equation $M(x, y)\,dx + N(x, y)\,dy = 0$ has $e^{-\int g(y)\,dy}$ as an integrating factor, if
$g(y) = \frac{1}{M}\left(\frac{\partial M}{\partial y} - \frac{\partial N}{\partial x}\right)$ is a function of $y$ alone.

(e) For the equation
$$y M_1(xy)\,dx + x N_1(xy)\,dy = 0$$
with $Mx - Ny \neq 0$, the function $\frac{1}{Mx - Ny}$ is an integrating factor.
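To see rule (c) at work, one may apply it to the equation $y\,dx - dy = 0$ discussed at the start of this subsection. The short computation below is a worked check added for illustration.

```latex
M = y,\quad N = -1,\qquad
f(x) = \frac{1}{N}\left(\frac{\partial M}{\partial y} - \frac{\partial N}{\partial x}\right)
     = \frac{1}{-1}\,(1 - 0) = -1,
\qquad
e^{\int f(x)\,dx} = e^{-x}.
```

This agrees with the integrating factor $e^{-x}$ used for that equation.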


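Exercise 7.3.9 1. Show that the following equations are exact and hence solve them.

(a) $(r + \sin\theta + \cos\theta)\frac{dr}{d\theta} + r(\cos\theta - \sin\theta) = 0$.

(b) $\left(e^{-x} - \ln y + \frac{y}{x}\right) + \left(-\frac{x}{y} + \ln x + \cos y\right)\frac{dy}{dx} = 0$.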
2. Find conditions on the function g(x,y) so that the equation 


d 
(x? + ay?) + {ax?y? + g(e,y)}—= =0 


is exact. 


3. What are the conditions on $f(x)$, $g(y)$, $\phi(x)$, and $\psi(y)$ so that the equation
$$(\phi(x) + \psi(y)) + (f(x) + g(y))\frac{dy}{dx} = 0$$


is exact. 


4. Verify that the following equations are not exact. Further find suitable integrating factors to solve 
them. 


dy 


(a) y+ (a + xy?) — = 0. 
v 

dy 
b) y? 2_1)—~ =0. 
(b) y° + Bry + y*°— 1) =0 

d 

(c) yt (w+ ay?) =0. 

d 
(d) y? + (3ay + y? — j= =0. 

dx 

5. Find the solution of 
d 

(a) (a?y + 2ay?) + 2(a? 4 327y) = 0 with y(1) =0 
7.4 Linear Equations

Sometimes we might think of a subset or subclass of differential equations which admit explicit solutions.
This question is pertinent when we say that there are no means to find the explicit solution of $\frac{dy}{dx} = f(x, y)$,
where $f$ is an arbitrary continuous function in $(x, y)$ in a suitable domain of definition. In this
context, we have a class of equations, called Linear Equations (to be defined shortly), which admit explicit
solutions.


Definition 7.4.1 (Linear/Nonlinear Equations)

Let $p(x)$ and $q(x)$ be real-valued piecewise continuous functions defined on the interval $I = [a, b]$. The equation
$$y' + p(x)y = q(x), \quad x \in I \tag{7.4.1}$$
is called a linear equation, where $y'$ stands for $\frac{dy}{dx}$. Equation (7.4.1) is called Linear non-homogeneous if
$q(x) \neq 0$ and is called Linear homogeneous if $q(x) = 0$ on $I$.

A first order equation is called a non-linear equation (in the dependent variable) if it is neither a linear
homogeneous nor a non-homogeneous linear equation.


Example 7.4.2 1. The equation $y' = \sin y$ is a non-linear equation.
2. The equation $y' + y = \sin x$ is a linear non-homogeneous equation.

3. The equation $y' + xy = 0$ is a linear homogeneous equation.
Define the indefinite integral $P(x) = \int p(x)\,dx$ (or $\int^{x} p(s)\,ds$). Multiplying Equation (7.4.1) by $e^{P(x)}$,
we get
$$e^{P(x)} y' + e^{P(x)} p(x) y = e^{P(x)} q(x), \quad\text{or equivalently}\quad \frac{d}{dx}\left(e^{P(x)} y\right) = e^{P(x)} q(x).$$

On integration, we get
$$e^{P(x)} y = c + \int e^{P(x)} q(x)\,dx.$$

In other words,
$$y = c e^{-P(x)} + e^{-P(x)} \int e^{P(x)} q(x)\,dx, \tag{7.4.2}$$
where $c$ is an arbitrary constant, is the general solution of Equation (7.4.1).


Remark 7.4.3 If we let $P(x) = \int_{a}^{x} p(s)\,ds$ in the above discussion, Equation (7.4.2) also represents
$$y = y(a) e^{-P(x)} + e^{-P(x)} \int_{a}^{x} e^{P(s)} q(s)\,ds. \tag{7.4.3}$$
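Formula (7.4.3) lends itself to direct numerical evaluation. The sketch below approximates both $P$ and the outer integral with the trapezoidal rule; the function name `solve_linear`, the grid size and the test problem $y' + y = 1$, $y(0) = 0$ are assumptions made for illustration only.

```python
import math

def solve_linear(p, q, a, ya, x, n=2000):
    """Evaluate formula (7.4.3) at x using the trapezoidal rule on n subintervals.

    P(s), the integral of p from a to s, is accumulated on the same grid.
    """
    h = (x - a) / n
    P = 0.0          # running value of P(s)
    integral = 0.0   # running value of the integral of e^{P(s)} q(s)
    s_prev = a
    g_prev = math.exp(P) * q(s_prev)
    for i in range(1, n + 1):
        s = a + i * h
        P += 0.5 * h * (p(s_prev) + p(s))
        g = math.exp(P) * q(s)
        integral += 0.5 * h * (g_prev + g)
        s_prev, g_prev = s, g
    return ya * math.exp(-P) + math.exp(-P) * integral

# Illustrative test problem: y' + y = 1, y(0) = 0, with exact solution 1 - e^{-x}
approx = solve_linear(lambda s: 1.0, lambda s: 1.0, a=0.0, ya=0.0, x=1.0)
exact = 1.0 - math.exp(-1.0)
print(approx, exact)
```

The two printed values should agree to several decimal places; the accuracy is limited by the quadrature rule, not by the formula.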


As a simple consequence, we have the following proposition. 


Proposition 7.4.4 $y = c e^{-P(x)}$ (where $c$ is any constant) is the general solution of the linear homogeneous
equation
$$y' + p(x)y = 0. \tag{7.4.4}$$

In particular, when $p(x) = k$ is a constant, the general solution is $y = c e^{-kx}$, with $c$ an arbitrary constant.
Example 7.4.5 1. Comparing the equation $y' = y$ with Equation (7.4.1), we have
$$p(x) = -1 \quad\text{and}\quad q(x) = 0.$$

Hence, $P(x) = \int (-1)\,dx = -x$. Substituting for $P(x)$ in Equation (7.4.2), we get $y = c e^{x}$ as the
required general solution.

We can just use the second part of the above proposition to get the above result, as $k = -1$.

2. The general solution of $xy' = -y$, $x \in I$ ($0 \notin I$), is $y = c x^{-1}$, where $c$ is an arbitrary constant. Notice
that no non-zero solution exists if we insist on the condition $\lim_{x \to 0,\, x > 0} y = 0$.


A class of nonlinear equations (named after Bernoulli (1654-1705)) can be reduced to linear
equations. These equations are of the type
$$y' + p(x)y = q(x)y^{a}. \tag{7.4.5}$$

If $a = 0$ or $a = 1$, then Equation (7.4.5) is a linear equation. Suppose that $a \neq 0, 1$. We then define
$u(x) = y^{1-a}$ and therefore
$$u' = (1 - a)y' y^{-a} = (1 - a)(q(x) - p(x)u)$$
or equivalently
$$u' + (1 - a)p(x)u = (1 - a)q(x), \tag{7.4.6}$$
a linear equation. For illustration, consider the following example.


Example 7.4.6 For $m, n$ constants and $m \neq 0$, solve $y' - my + ny^2 = 0$.
Solution: Let $u = y^{-1}$. Then $u(x)$ satisfies
$$u' + mu = n$$
and its solution is
$$u = A e^{-mx} + \frac{n}{m}.$$
Equivalently,
$$y = \frac{1}{A e^{-mx} + \frac{n}{m}},$$
with $m \neq 0$ and $A$ an arbitrary constant, is the general solution.
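The general solution of Example 7.4.6 can be checked by substituting it back into the equation. The sketch below does this numerically, approximating $y'$ by a central difference; the parameter values $m = 2$, $n = 3$, $A = 1.5$ and the function names are illustrative assumptions.

```python
import math

def y_general(x, m, n, A):
    # General solution from Example 7.4.6: y = 1 / (n/m + A e^{-m x})
    return 1.0 / (n / m + A * math.exp(-m * x))

def residual(x, m, n, A, h=1e-5):
    """y' - m y + n y^2, with y' approximated by a central difference."""
    dy = (y_general(x + h, m, n, A) - y_general(x - h, m, n, A)) / (2 * h)
    y = y_general(x, m, n, A)
    return dy - m * y + n * y * y

# Illustrative parameter values (not taken from the text)
for x in (0.0, 0.5, 1.0):
    print(residual(x, m=2.0, n=3.0, A=1.5))  # each value should be close to zero
```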


Exercise 7.4.7 1. In Example 7.4.6, show that $u' + mu = n$.

2. Find the general solution of the following:

3. Solve the following IVP's:

(a) $y' - 4y = 5$, $y(0) = 0$.
(b) $y' + (1 + x^2)y = 3$, $y(0) = 0$.
(c) $y' + y = \cos x$, $y(\pi) = 0$.

(d) $y' - y^2 = 1$, $y(0) = 0$.

(e) $(1 + x)y' + y = 2x^2$, $y(1) = 1$.

4. Let $y_1$ be a solution of $y' + a(x)y = b_1(x)$ and $y_2$ be a solution of $y' + a(x)y = b_2(x)$. Then show that
$y_1 + y_2$ is a solution of
$$y' + a(x)y = b_1(x) + b_2(x).$$


5. Reduce the following to linear equations and hence solve:

(a) $y' + 2y = y^2$.

(b) (wy + #e¥)y! = y?. 

(c) $y' \sin(y) + x\cos(y) = x$.
(d) $y' - y = xy^2$.


6. Find the solution of the IVP 
y' +4ay+ay* =0, y(0) = 


sim 


7.5 Miscellaneous Remarks 


In Section 7.4 we have learned to solve linear equations. There are many other equations, though
not linear, which are also amenable to solution. Below, we consider a few classes of equations which can
be solved. In this section and in the sequel, $p$ denotes $\frac{dy}{dx}$ or $y'$. A word of caution is needed here. The
methods described below are more or less ad hoc methods.


1. EQUATIONS SOLVABLE FOR y:
Consider an equation of the form
$$y = f(x, p). \tag{7.5.1}$$
Differentiating with respect to $x$, we get
$$\frac{dy}{dx} = p = \frac{\partial f(x, p)}{\partial x} + \frac{\partial f(x, p)}{\partial p}\cdot\frac{dp}{dx}, \quad\text{or equivalently}\quad p = g\left(x, p, \frac{dp}{dx}\right). \tag{7.5.2}$$

Equation (7.5.2) can be viewed as a differential equation in $p$ and $x$. We now assume that Equation
(7.5.2) can be solved for $p$ and its solution is
$$h(x, p, c) = 0. \tag{7.5.3}$$

If we are able to eliminate $p$ between Equations (7.5.1) and (7.5.3), then we have an implicit
solution of the Equation (7.5.1).
Solve $y = 2px - xp^2$.

Solution: Differentiating with respect to $x$ and replacing $\frac{dy}{dx}$ by $p$, we get
$$p = 2p - p^2 + 2x\frac{dp}{dx} - 2xp\frac{dp}{dx}, \quad\text{or}\quad \left(p + 2x\frac{dp}{dx}\right)(1 - p) = 0.$$
So, either
$$p + 2x\frac{dp}{dx} = 0 \quad\text{or}\quad p = 1.$$
That is, either $p^2 x = c$ or $p = 1$. Eliminating $p$ from the given equation leads to an explicit solution
$$y = 2x\sqrt{\frac{c}{x}} - c \quad\text{or}\quad y = x.$$

The first solution is a one-parameter family of solutions, giving us a general solution. The latter
one is a solution but not a general solution since it is not a one-parameter family of solutions.


2. EQUATIONS IN WHICH THE INDEPENDENT VARIABLE x IS MISSING:
These are equations of the type $f(y, p) = 0$. If possible we solve for $y$ and we proceed. Sometimes
introducing an arbitrary parameter helps. We illustrate it below.

Solve $y^2 + p^2 = a^2$ where $a$ is a constant.
Solution: We equivalently rewrite the given equation, by (arbitrarily) introducing a new parameter
$t$ by
$$y = a\sin t, \quad p = a\cos t,$$
from which it follows
$$\frac{dy}{dt} = a\cos t, \quad p = \frac{dy}{dx} = \frac{dy/dt}{dx/dt},$$
and so
$$\frac{dx}{dt} = \frac{1}{p}\frac{dy}{dt} = 1, \quad\text{or}\quad x = t + c.$$

Therefore, a general solution is (after renaming the arbitrary constant)
$$y = a\sin(x + c).$$
3. EQUATIONS IN WHICH y (DEPENDENT VARIABLE OR THE UNKNOWN) IS MISSING:
We illustrate this case by an example.

Find the general solution of $x = p^3 - p - 1$.
Solution: Recall that $p = \frac{dy}{dx}$. Now, from the given equation, we have
$$\frac{dy}{dp} = \frac{dy}{dx}\cdot\frac{dx}{dp} = p(3p^2 - 1).$$
Therefore,
$$y = \frac{3}{4}p^4 - \frac{1}{2}p^2 + c$$
(regarding $p$ as a parameter). The desired solution in this case is in the parametric form, given by
$$x = t^3 - t - 1 \quad\text{and}\quad y = \frac{3}{4}t^4 - \frac{1}{2}t^2 + c,$$
where $c$ is an arbitrary constant.
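The parametric solution of the last example can be checked numerically: along the curve, the slope $dy/dx = (dy/dt)/(dx/dt)$ should equal the parameter $t$, so that $x = p^3 - p - 1$ holds with $p = t$. In the sketch below the function names, the step size and the sample parameter values are illustrative choices.

```python
def x_of(t):
    return t**3 - t - 1

def y_of(t, c=0.0):
    return 0.75 * t**4 - 0.5 * t**2 + c

def slope(t, h=1e-6):
    """dy/dx along the parametric curve, computed as (dy/dt) / (dx/dt)."""
    dx = (x_of(t + h) - x_of(t - h)) / (2 * h)
    dy = (y_of(t + h) - y_of(t - h)) / (2 * h)
    return dy / dx

# Sample values chosen away from 3 t^2 = 1, where dx/dt vanishes
for t in (0.2, 1.0, 2.0):
    p = slope(t)
    # p should equal t, so x - (p^3 - p - 1) should be close to zero
    print(t, p, x_of(t) - (p**3 - p - 1))
```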


Remark 7.5.1 The readers are again informed that the methods discussed in 1), 2), 3) are more
or less ad hoc methods. They may not work in all cases.


Exercise 7.5.2. 1. Find the general solution of $y = (1 + p)x + p^2$.
Hint: Differentiate with respect to $x$ to get $\frac{dx}{dp} = -(x + 2p)$ (a linear equation in $x$). Express the
solution in the parametric form
$$y(p) = (1 + p)x + p^2, \quad x(p) = 2(1 - p) + c e^{-p}.$$
2. Solve the following differential equations: 


(a) 8y =a? +p’. 
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(b) ytap=z*p". 
(c 
(d 
(e) 2y = 22? + 4px + p?. 


y? logy — p* = 2xyp. 


) 
) 
) Qy + p* + 2p = 22(p +1). 
) 


7.6 Initial Value Problems 


As we had seen, there are no methods to solve a general equation of the form 


y' = f(x,y) (7.6.1) 
and in this context two questions may be pertinent. 
1. Does Equation (7.6.1) admit solutions at all (i.e., the existence problem)? 


2. Is there a method to find solutions of Equation (7.6.1) in case the answer to the above question is 
in the affirmative? 


The answers to the above two questions are not simple. But there are partial answers if some 
additional restrictions on the function f are imposed. The details are discussed in this section. 
For $a, b \in \mathbb{R}$ with $a > 0$, $b > 0$, we define
$$S = \{(x, y) \in \mathbb{R}^2 : |x - x_0| \le a,\ |y - y_0| \le b\} \subset I \times \mathbb{R}.$$


Definition 7.6.1 (Initial Value Problems) Let $f : S \to \mathbb{R}$ be a continuous function on $S$. The problem
of finding a solution $y$ of
$$y' = f(x, y), \quad (x, y) \in S,\ x \in I, \quad\text{with } y(x_0) = y_0 \tag{7.6.2}$$
in a neighbourhood $I$ of $x_0$ (or an open interval $I$ containing $x_0$) is called an Initial Value Problem, henceforth
denoted by IVP.

The condition $y(x_0) = y_0$ in Equation (7.6.2) is called the INITIAL CONDITION stated at $x = x_0$, and $y_0$
is called the INITIAL VALUE.
Further, we assume that $a$ and $b$ are finite. Let
$$M = \max\{|f(x, y)| : (x, y) \in S\}.$$

Such an $M$ exists since $S$ is a closed and bounded set and $f$ is a continuous function. Let
$h = \min\left(a, \frac{b}{M}\right)$. The ensuing proposition is simple and hence the proof is omitted.


Proposition 7.6.2 A function $y$ is a solution of IVP (7.6.2) if and only if $y$ satisfies
$$y = y_0 + \int_{x_0}^{x} f(s, y(s))\,ds. \tag{7.6.3}$$

In the absence of any knowledge of a solution of IVP (7.6.2), we now try to find an approximate
solution. Any solution of the IVP (7.6.2) must satisfy the initial condition $y(x_0) = y_0$. Hence, as a crude
approximation to the solution of IVP (7.6.2), we define
$$y_0(x) = y_0 \quad\text{for all } x \in [x_0 - h, x_0 + h].$$
Now Equation (7.6.3) appearing in Proposition 7.6.2 helps us to refine or improve the approximate
solution $y_0$ with a hope of getting a better approximate solution. We define
$$y_1 = y_0 + \int_{x_0}^{x} f(s, y_0)\,ds$$
and for $n = 2, 3, \ldots$, we inductively define
$$y_n = y_0 + \int_{x_0}^{x} f(s, y_{n-1}(s))\,ds \quad\text{for all } x \in [x_0 - h, x_0 + h].$$

As yet we have not checked a few things, like whether the point $(s, y_n(s)) \in S$ or not. We formalise
the theory in the latter part of this section. To get ourselves motivated, let us apply the above method
to the following IVP.


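Example 7.6.3 Solve the IVP
$$y' = -y, \quad y(0) = 1, \quad -1 \le x \le 1.$$

Solution: From Proposition 7.6.2, a function $y$ is a solution of the above IVP if and only if
$$y = 1 - \int_{0}^{x} y(s)\,ds.$$
We have $y_0 = y(0) = 1$ and
$$y_1 = 1 - \int_{0}^{x} ds = 1 - x.$$
So,
$$y_2 = 1 - \int_{0}^{x} (1 - s)\,ds = 1 - x + \frac{x^2}{2!}.$$
By induction, one can easily verify that
$$y_n = 1 - x + \frac{x^2}{2!} - \frac{x^3}{3!} + \cdots + (-1)^n \frac{x^n}{n!}.$$
Note: The solution of the given IVP is
$$y = e^{-x} \quad\text{and that}\quad \lim_{n \to \infty} y_n = e^{-x}.$$

This example justifies the use of the word approximate solution for the $y_n$'s.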
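The iterates of Example 7.6.3 can also be generated mechanically. The sketch below stores each $y_n$ as a list of exact rational coefficients (the entry at index $k$ is the coefficient of $x^k$) and applies $y_n = 1 - \int_0^x y_{n-1}(s)\,ds$ repeatedly. The function names and the list representation are illustrative choices, and the code is specific to $f(x, y) = -y$.

```python
from fractions import Fraction

def integrate_from_zero(coeffs):
    """Integrate a polynomial (coeffs[k] is the coefficient of x^k) from 0 to x."""
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(coeffs)]

def picard_iterates(n):
    """Return y_0, ..., y_n for y' = -y, y(0) = 1 as coefficient lists."""
    y = [Fraction(1)]                      # y_0(x) = 1
    iterates = [y]
    for _ in range(n):
        integral = integrate_from_zero([-c for c in y])   # integral of f(s, y) = -y
        integral[0] += 1                   # add the initial value y_0 = 1
        y = integral
        iterates.append(y)
    return iterates

for k, coeffs in enumerate(picard_iterates(4)):
    print(k, [str(c) for c in coeffs])
```

The coefficients produced are $(-1)^k / k!$, which are the Taylor coefficients of $e^{-x}$ up to degree $n$.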
We now formalise the above procedure. 


Definition 7.6.4 (Picard's Successive Approximations) Consider the IVP (7.6.2). For $x \in I$ with
$|x - x_0| \le a$, define inductively
$y_0(x) = y_0$ and, for $n = 1, 2, \ldots$,
$$y_n(x) = y_0 + \int_{x_0}^{x} f(s, y_{n-1}(s))\,ds. \tag{7.6.4}$$

Then $y_0, y_1, \ldots, y_n, \ldots$ are called Picard's successive approximations to the IVP (7.6.2).

Whether Equation (7.6.4) is well defined or not is settled in the following proposition.

Proposition 7.6.5 The Picard's approximates $y_n$'s for the IVP (7.6.2), defined by Equation (7.6.4), are well
defined on the interval $|x - x_0| \le h = \min\{a, \frac{b}{M}\}$, i.e., for $x \in [x_0 - h, x_0 + h]$.
PROOF. We have to verify that for each $n = 0, 1, 2, \ldots$, $(s, y_n)$ belongs to the domain of definition of $f$
for $|s - x_0| \le h$. This is needed because otherwise $f(s, y_n)$, appearing as the integrand in Equation (7.6.4),
may not be defined. For $n = 0$, it is obvious that $(s, y_0) \in S$, as $|s - x_0| \le a$ and $|y_0 - y_0| = 0 \le b$. For
$n = 1$, we notice that, if $|x - x_0| \le h$, then
$$|y_1 - y_0| \le M|x - x_0| \le Mh \le b.$$

So, $(x, y_1) \in S$ whenever $|x - x_0| \le h$.
The rest of the proof is by the method of induction. We have established the result for $n = 1$, namely
$$(x, y_1) \in S \quad\text{if } |x - x_0| \le h.$$
Assume that for $k = 1, 2, \ldots, n-1$, $(x, y_k) \in S$ whenever $|x - x_0| \le h$. Now, by definition of $y_n$, we have
$$y_n - y_0 = \int_{x_0}^{x} f(s, y_{n-1})\,ds.$$
But then by the induction hypothesis $(s, y_{n-1}) \in S$ and hence
$$|y_n - y_0| \le M|x - x_0| \le Mh \le b.$$

This shows that $(x, y_n) \in S$ whenever $|x - x_0| \le h$. Hence $(x, y_k) \in S$ for $k = n$ holds and therefore the
proof of the proposition is complete.


Let us again come back to Example 7.6.3 in the light of Proposition 7.6.5.

Example 7.6.6 Compute the successive approximations to the IVP
$$y' = -y, \quad -1 \le x \le 1, \quad |y - 1| \le 1 \quad\text{and}\quad y(0) = 1. \tag{7.6.5}$$

Solution: Note that $x_0 = 0$, $y_0 = 1$, $f(x, y) = -y$, and $a = b = 1$. The set $S$ on which we are studying the
differential equation is
$$S = \{(x, y) : |x| \le 1,\ |y - 1| \le 1\}.$$

By Proposition 7.6.5, on this set
$$M = \max\{|y| : (x, y) \in S\} = 2 \quad\text{and}\quad h = \min\{1, 1/2\} = 1/2.$$

Therefore, the approximate solutions $y_n$'s are defined only for the interval $\left[-\frac{1}{2}, \frac{1}{2}\right]$, if we use Proposition
7.6.5.

Observe that the exact solution $y = e^{-x}$ and the approximate solutions $y_n$'s of Example 7.6.3 exist
on $[-1, 1]$. But the approximate solutions as seen above are defined in the interval $\left[-\frac{1}{2}, \frac{1}{2}\right]$.
That is, for any IVP, the approximate solutions $y_n$'s may exist on a larger interval as compared to
the interval obtained by the application of Proposition 7.6.5.
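The quantities $M$ and $h$ of this example can be reproduced with a few lines of code. The sketch below estimates $M$ by sampling $|f|$ on a grid over $S$ and then forms $h = \min(a, b/M)$; the function name `picard_interval` and the grid size are illustrative assumptions, and grid sampling only approximates the maximum in general.

```python
def picard_interval(f, x0, y0, a, b, grid=200):
    """Estimate M = max |f| on the rectangle S by sampling a grid, then return (M, h).

    Grid sampling only approximates the maximum in general; it is exact here
    because |f| = |y| attains its maximum at a grid point on the edge y = y0 + b.
    Assumes f is not identically zero on the grid (otherwise b / M is undefined).
    """
    M = 0.0
    for i in range(grid + 1):
        x = x0 - a + 2 * a * i / grid
        for j in range(grid + 1):
            y = y0 - b + 2 * b * j / grid
            M = max(M, abs(f(x, y)))
    return M, min(a, b / M)

# Data of Example 7.6.6: f(x, y) = -y, x0 = 0, y0 = 1, a = b = 1
M, h = picard_interval(lambda x, y: -y, x0=0.0, y0=1.0, a=1.0, b=1.0)
print(M, h)
```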


We now consider another example. 
Example 7.6.7 Find the Picard's successive approximations for the IVP
$$y' = f(y), \quad 0 \le x \le 1, \quad y \ge 0 \quad\text{and}\quad y(0) = 0, \tag{7.6.6}$$
where
$$f(y) = \sqrt{y} \quad\text{for } y \ge 0.$$
Solution: By definition $y_0(x) = y_0 \equiv 0$ and
$$y_1(x) = y_0 + \int_{0}^{x} f(y_0)\,ds = 0 + \int_{0}^{x} \sqrt{0}\,ds = 0.$$

A similar argument implies that $y_n(x) \equiv 0$ for all $n = 2, 3, \ldots$ and $\lim_{n \to \infty} y_n(x) \equiv 0$. Also, it can be easily
verified that $y(x) \equiv 0$ is a solution of the IVP (7.6.6).

Also $y(x) = \frac{x^2}{4}$, $0 \le x \le 1$, is a solution of Equation (7.6.6), and the $\{y_n\}$'s do not converge to $\frac{x^2}{4}$. Note
here that the IVP (7.6.6) has at least two solutions.


The following result is about the existence of a unique solution to a class of IVPs. We state the 
theorem without proof. 


Theorem 7.6.8 (Picard's Theorem on Existence and Uniqueness) Let $S = \{(x, y) : |x - x_0| \le a,\ |y - y_0| \le b\}$, and $a, b > 0$. Let $f : S \to \mathbb{R}$ be such that $f$ as well as $\frac{\partial f}{\partial y}$ are continuous on $S$. Also, let $M, K \in \mathbb{R}$
be constants such that
$$|f| \le M, \quad \left|\frac{\partial f}{\partial y}\right| \le K \quad\text{on } S.$$

Let $h = \min\{a, b/M\}$. Then the sequence of successive approximations $\{y_n\}$ (defined by Equation (7.6.4))
for the IVP (7.6.2) uniformly converges on $|x - x_0| \le h$ to a solution of IVP (7.6.2). Moreover the solution
to IVP (7.6.2) is unique.


Remark 7.6.9 The theorem asserts the existence of a unique solution on a subinterval $|x - x_0| \le h$ of
the given interval $|x - x_0| \le a$. In a way it is in a neighbourhood of $x_0$ and so this result is also called
the local existence of a unique solution. A natural question is whether the solution exists on the whole
of the interval $|x - x_0| \le a$. The answer to this question is beyond the scope of this book.

Whenever we talk of the Picard’s theorem, we mean it in this local sense. 


Exercise 7.6.10 1. Compute the sequence $\{y_n\}$ of the successive approximations to the IVP
$$y' = y(y - 1), \quad y(x_0) = 0, \quad x_0 \ge 0.$$
2. Show that the solution of the IVP
$$y' = y(y - 1), \quad y(x_0) = 1, \quad x_0 \ge 0$$
is $y \equiv 1$, $x \ge x_0$.

3. The IVP
$$y' = \sqrt{y}, \quad y(0) = 0, \quad x \ge 0$$
has solutions $y_1 \equiv 0$ as well as $y_2 = \frac{x^2}{4}$, $x \ge 0$. Why does the existence of the two solutions not
contradict the Picard's theorem?
4. Consider the IVP
$$y' = y, \quad y(0) = 1 \quad\text{in } \{(x, y) : |x| \le a,\ |y| \le b\}$$
for any $a, b > 0$.

(a) Compute the interval of existence of the solution of the IVP by using Theorem 7.6.8.
(b) Show that $y = e^{x}$ is the solution of the IVP which exists on the whole of $\mathbb{R}$.

This again shows that the solution to an IVP may exist on a larger interval than what is being implied
by Theorem 7.6.8.
7.6.1 Orthogonal Trajectories

One among the many applications of differential equations is to find curves that intersect a given family
of curves at right angles. In other words, given a family $F$ of curves, we wish to find a curve (or curves)
$\Gamma$ which intersects orthogonally with any member of $F$ (whenever they intersect). It is important to
note that we are not insisting that $\Gamma$ should intersect every member of $F$, but if they intersect, the
angle between their tangents, at every point of intersection, is $90^\circ$. Such a family of curves $\Gamma$ is called
"orthogonal trajectories" of the family $F$. That is, at the common point of intersection, the tangents are
orthogonal. In case the families $F$ and $\Gamma$ are identical, we say that the family is self-orthogonal.
Before proceeding to an example, let us note that at the common point of intersection, the product
of the slopes of the tangents is $-1$. In order to find the orthogonal trajectories of a family of curves
$F$, parametrized by a constant $c$, we eliminate $c$ between $y$ and $y'$. This gives the slope at any point
$(x, y)$ and is independent of the choice of the curve. Below, we illustrate how to obtain the orthogonal
trajectories.
Example 7.6.11 Compute the orthogonal trajectories of the family $F$ of curves given by
$$F : \quad y^2 = cx^3, \tag{7.6.7}$$
where $c$ is an arbitrary constant.
Solution: Differentiating Equation (7.6.7), we get
$$2yy' = 3cx^2. \tag{7.6.8}$$
Elimination of $c$ between Equations (7.6.7) and (7.6.8) leads to
$$y' = \frac{3cx^2}{2y} = \frac{3}{2x}\cdot\frac{cx^3}{y} = \frac{3y}{2x}. \tag{7.6.9}$$

At the point $(x, y)$, if any curve intersects orthogonally, then (if its slope is $y'$) we must have
$$y' = -\frac{2x}{3y}.$$

Solving this differential equation, we get
$$y^2 = -\frac{2x^2}{3} + c.$$

Or equivalently, $y^2 + \frac{2x^2}{3} = c$ is a family of curves which intersects the given family $F$ orthogonally.
Below, we summarize how to determine the orthogonal trajectories. 
Step 1: Given the family $F(x, y, c) = 0$, determine the differential equation,
$$y' = f(x, y), \tag{7.6.10}$$
for which the given family $F$ is a general solution. Equation (7.6.10) is obtained by the elimination of
the constant $c$ appearing in $F(x, y, c) = 0$ "using the equation obtained by differentiating this equation
with respect to $x$".

Step 2: The differential equation for the orthogonal trajectories is then given by
$$y' = -\frac{1}{f(x, y)}. \tag{7.6.11}$$

Final Step: The general solution of Equation (7.6.11) gives the orthogonal trajectories of the given family.
In the following, let us go through the steps. 
Example 7.6.12 Find the orthogonal trajectories of the family of straight lines
$$y = mx + 1, \tag{7.6.12}$$
where $m$ is a real parameter.

Solution: Differentiating Equation (7.6.12), we get $y' = m$. So, substituting $m$ in Equation (7.6.12), we
have $y = y'x + 1$. Or equivalently,
$$y' = \frac{y - 1}{x}.$$

So, by the final step, the orthogonal trajectories satisfy the differential equation
$$y' = \frac{x}{1 - y}. \tag{7.6.13}$$
It can be easily verified that the general solution of Equation (7.6.13) is
$$x^2 + y^2 - 2y = c, \tag{7.6.14}$$
where $c$ is an arbitrary constant. In other words, the orthogonal trajectories of the family of straight
lines (7.6.12) is the family of circles given by Equation (7.6.14).
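The orthogonality claimed in Example 7.6.12 can be checked pointwise: at any point with $x \neq 0$ and $y \neq 1$, the slope of the line through that point and the slope prescribed by Equation (7.6.13) should multiply to $-1$. The sketch below does this at a few sample points and also checks that the slope field (7.6.13) is tangent to the level curves of $x^2 + y^2 - 2y$; the function names and sample points are illustrative choices.

```python
def line_slope(x, y):
    # Slope field of the family y = m x + 1: y' = (y - 1) / x
    return (y - 1) / x

def circle_slope(x, y):
    # Slope field of the trajectories, Equation (7.6.13): y' = x / (1 - y)
    return x / (1 - y)

def tangency_defect(x, y):
    # For F = x^2 + y^2 - 2y, a curve F = c has tangent (1, y') with F_x + F_y * y' = 0
    return 2 * x + (2 * y - 2) * circle_slope(x, y)

points = [(1.0, 3.0), (-2.0, 0.5), (0.5, -1.0)]   # sample points with x != 0, y != 1
products = [line_slope(x, y) * circle_slope(x, y) for x, y in points]
defects = [tangency_defect(x, y) for x, y in points]
print(products)   # each product should be -1
print(defects)    # each defect should be 0
```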


Exercise 7.6.13 1. Find the orthogonal trajectories of the following family of curves (the constant c 
appearing below is an arbitrary constant). 


2. Show that the one parameter family of curves y? = 4k(k + x), k € R are self orthogonal. 


3. Find the orthogonal trajectories of the family of circles passing through the points (1, —2) and (1, 2). 


7.7 Numerical Methods 


All said and done, Picard's successive approximations are not suitable for computations on computers.
In the absence of methods for closed form solution (in the explicit form), we wish to explore "how
computers can be used to find approximate solutions of IVP" of the form
$$y' = f(x, y), \quad y(x_0) = y_0. \tag{7.7.1}$$

In this section, we study a simple method to find the "numerical solutions" of Equation (7.7.1). The
study of differential equations has two important aspects (among other features), namely, the qualitative
theory and the quantitative theory; the latter is called "Numerical methods" for solving Equation (7.7.1). What is presented here is
at a very rudimentary level; nevertheless it gives a flavour of the numerical method.

To proceed further, we assume that $f$ is a "good function" (thereby meaning "sufficiently differentiable"). In such a case, we have
$$y(x + h) = y + h y' + \frac{h^2}{2!} y'' + \cdots,$$
which suggests a "crude" approximation $y(x + h) \approx y + h f(x, y)$ (if $h$ is small enough); the symbol $\approx$
means "approximately equal to". With this in mind, let us think of finding $y(x)$, where $y$ is the solution of
Equation (7.7.1) with $x > x_0$. Let $h = \frac{x - x_0}{n}$ and define
$$x_i = x_0 + ih, \quad i = 0, 1, 2, \ldots, n.$$

That is, we have divided the interval $[x_0, x]$ into $n$ equal intervals with end points $x_0, x_1, \ldots, x_n = x$.

Figure 7.1: Partitioning the interval

Our aim is to calculate $y(x)$. At the first step, we have $y(x_0 + h) \approx y_0 + h f(x_0, y_0)$. Define $y_1 =
y_0 + h f(x_0, y_0)$. Error at the first step is
$$|y(x_0 + h) - y_1| = E_1.$$

Similarly, we define $y_2 = y_1 + h f(x_1, y_1)$ and we approximate $y(x_0 + 2h) = y(x_2) \approx y_1 + h f(x_1, y_1) = y_2$
and so on. In general, by letting $y_k = y_{k-1} + h f(x_{k-1}, y_{k-1})$, we define (inductively)
$$y(x_0 + (k + 1)h) \approx y_{k+1} = y_k + h f(x_k, y_k), \quad k = 0, 1, 2, \ldots, n-1.$$

This method of calculation of $y_1, y_2, \ldots, y_n$ is called Euler's method. The approximate solution of
Equation (7.7.1) is obtained by "linear elements" joining $(x_0, y_0), (x_1, y_1), \ldots, (x_n, y_n)$.
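The recursion $y_{k+1} = y_k + h f(x_k, y_k)$ translates directly into a short program. The following is a minimal sketch; the function name `euler` and the test problem $y' = -y$, $y(0) = 1$ (exact solution $e^{-x}$) are chosen here for illustration and do not come from the text.

```python
import math

def euler(f, x0, y0, x_end, n):
    """Euler's method: return the lists [x_0, ..., x_n] and [y_0, ..., y_n]."""
    h = (x_end - x0) / n
    xs, ys = [x0], [y0]
    for k in range(n):
        ys.append(ys[-1] + h * f(xs[-1], ys[-1]))   # y_{k+1} = y_k + h f(x_k, y_k)
        xs.append(x0 + (k + 1) * h)                 # x_{k+1} = x_0 + (k + 1) h
    return xs, ys

# Illustrative test problem: y' = -y, y(0) = 1, approximated at x = 1
for n in (10, 100, 1000):
    xs, ys = euler(lambda x, y: -y, 0.0, 1.0, 1.0, n)
    print(n, ys[-1], abs(ys[-1] - math.exp(-1.0)))
```

For this problem the error at $x = 1$ shrinks as $n$ grows, which matches the expectation that a smaller step $h$ gives a better approximation.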


Figure 7.2: Approximate Solution 
Chapter 8

Second Order and Higher Order Equations


8.1 Introduction 


Second order and higher order equations occur frequently in science and engineering (like the pendulum
problem etc.) and hence have their own importance. They have their own flavour also. We devote this section to
an elementary introduction.
Definition 8.1.1 (Second Order Linear Differential Equation) The equation
$$p(x)y'' + q(x)y' + r(x)y = c(x), \quad x \in I \tag{8.1.1}$$
is called a SECOND ORDER LINEAR DIFFERENTIAL EQUATION.

Here $I$ is an interval contained in $\mathbb{R}$, and the functions $p(\cdot)$, $q(\cdot)$, $r(\cdot)$, and $c(\cdot)$ are real valued continuous
functions defined on $\mathbb{R}$. The functions $p(\cdot)$, $q(\cdot)$, and $r(\cdot)$ are called the coefficients of Equation (8.1.1) and
$c(x)$ is called the non-homogeneous term or the force function.

Equation (8.1.1) is called linear homogeneous if $c(x) \equiv 0$ and non-homogeneous if $c(x) \neq 0$.

Recall that a second order equation is called nonlinear if it is not linear. 


Example 8.1.2 1. The equation 


is a second order equation which is nonlinear. 
2. $y'' - y = 0$ is an example of a linear second order equation.
3. $y'' + y' + y = \sin x$ is a non-homogeneous linear second order equation.

4. $ax^2 y'' + bxy' + cy = 0$, $c \neq 0$, is a homogeneous second order linear equation. This equation is called
EULER EQUATION OF ORDER 2. Here $a$, $b$, and $c$ are real constants.


Definition 8.1.3 A function y defined on I is called a solution of Equation (8.1.1) if y is twice differentiable 
and satisfies Equation (8.1.1). 


Example 8.1.4 1. $e^{x}$ and $e^{-x}$ are solutions of $y'' - y = 0$.
2. $\sin x$ and $\cos x$ are solutions of $y'' + y = 0$.
We now state an important theorem whose proof is simple and is omitted.

Theorem 8.1.5 (Superposition Principle) Let $y_1$ and $y_2$ be two given solutions of
$$p(x)y'' + q(x)y' + r(x)y = 0, \quad x \in I. \tag{8.1.2}$$
Then for any two real numbers $c_1, c_2$, the function $c_1 y_1 + c_2 y_2$ is also a solution of Equation (8.1.2).

It is to be noted here that Theorem 8.1.5 is not an existence theorem. That is, it does not assert the
existence of a solution of Equation (8.1.2).

Definition 8.1.6 (Solution Space) The set of solutions of a differential equation is called the solution space.

For example, all the solutions of Equation (8.1.2) form a solution space. Note that $y(x) \equiv 0$ is
also a solution of Equation (8.1.2). Therefore, the solution set of Equation (8.1.2) is non-empty. A
moment's reflection on Theorem 8.1.5 tells us that the solution space of Equation (8.1.2) forms a real
vector space.


Remark 8.1.7 The above statements also hold for any homogeneous linear differential equation. That 


is, the solution space of a homogeneous linear differential equation is a real vector space. 


The natural question is to inquire about its dimension. This question will be answered in a sequence 
of results stated below. 
We first recall the definition of Linear Dependence and Independence. 


Definition 8.1.8 (Linear Dependence and Linear Independence) Let $I$ be an interval in $\mathbb{R}$ and let $f, g :
I \to \mathbb{R}$ be continuous functions. We say that $f, g$ are linearly dependent if there are real numbers
$a$ and $b$ (not both zero) such that
$$af(t) + bg(t) = 0 \quad\text{for all } t \in I.$$
The functions $f(\cdot), g(\cdot)$ are said to be linearly independent if $f(\cdot), g(\cdot)$ are not linearly dependent.


To proceed further and to simplify matters, we assume that $p(x) \equiv 1$ in Equation (8.1.2) and that
the functions $q(x)$ and $r(x)$ are continuous on $I$.
In other words, we consider a homogeneous linear equation
$$y'' + q(x)y' + r(x)y = 0, \quad x \in I, \tag{8.1.3}$$
where $q$ and $r$ are real valued continuous functions defined on $I$.
The next theorem, given without proof, deals with the existence and uniqueness of solutions of
Equation (8.1.3) with initial conditions $y(x_0) = A$, $y'(x_0) = B$ for some $x_0 \in I$.

Theorem 8.1.9 (Picard's Theorem on Existence and Uniqueness) Consider the Equation (8.1.3) along
with the conditions
$$y(x_0) = A, \quad y'(x_0) = B, \quad\text{for some } x_0 \in I, \tag{8.1.4}$$
where $A$ and $B$ are prescribed real constants. Then Equation (8.1.3), with initial conditions given by Equation
(8.1.4), has a unique solution on $I$.

A word of Caution: NOTE THAT THE COEFFICIENT OF $y''$ IN EQUATION (8.1.3) IS 1. BEFORE
WE APPLY THEOREM 8.1.9, WE HAVE TO ENSURE THIS CONDITION.

An important application of Theorem 8.1.9 is that the equation (8.1.3) has exactly 2 linearly independent
solutions. In other words, the set of all solutions over $\mathbb{R}$ forms a real vector space of dimension
2.
Theorem 8.1.10 Let $q$ and $r$ be real valued continuous functions on $I$. Then Equation (8.1.3) has exactly
two linearly independent solutions. Moreover, if $y_1$ and $y_2$ are two linearly independent solutions of Equation
(8.1.3), then the solution space is a linear combination of $y_1$ and $y_2$.

PROOF. Let $y_1$ and $y_2$ be two unique solutions of Equation (8.1.3) with initial conditions
$$y_1(x_0) = 1,\ y_1'(x_0) = 0, \quad\text{and}\quad y_2(x_0) = 0,\ y_2'(x_0) = 1 \quad\text{for some } x_0 \in I. \tag{8.1.5}$$

The unique solutions $y_1$ and $y_2$ exist by virtue of Theorem 8.1.9. We now claim that $y_1$ and $y_2$ are
linearly independent. Consider the system of linear equations
$$\alpha y_1(x) + \beta y_2(x) = 0, \tag{8.1.6}$$
where $\alpha$ and $\beta$ are unknowns. If we can show that the only solution for the system (8.1.6) is $\alpha = \beta = 0$,
then the two solutions $y_1$ and $y_2$ will be linearly independent.

Use the initial conditions on $y_1$ and $y_2$ to show that the only solution is indeed $\alpha = \beta = 0$. Hence the
result follows.

We now show that any solution of Equation (8.1.3) is a linear combination of $y_1$ and $y_2$. Let $\zeta$ be
any solution of Equation (8.1.3) and let $d_1 = \zeta(x_0)$ and $d_2 = \zeta'(x_0)$. Consider the function $\phi$ defined by
$$\phi(x) = d_1 y_1(x) + d_2 y_2(x).$$

By Definition 8.1.3, $\phi$ is a solution of Equation (8.1.3). Also note that $\phi(x_0) = d_1$ and $\phi'(x_0) = d_2$. So, $\phi$
and $\zeta$ are two solutions of Equation (8.1.3) with the same initial conditions. Hence by Picard's Theorem
on Existence and Uniqueness (see Theorem 8.1.9), $\phi(x) \equiv \zeta(x)$ or
$$\zeta(x) = d_1 y_1(x) + d_2 y_2(x).$$

Thus, the equation (8.1.3) has two linearly independent solutions.


Remark 8.1.11 1. Observe that the solution space of Equation (8.1.3) forms a real vector space of
dimension 2.

2. The solutions $y_1$ and $y_2$ corresponding to the initial conditions
\[ y_1(x_0) = 1, \; y_1'(x_0) = 0, \quad \text{and} \quad y_2(x_0) = 0, \; y_2'(x_0) = 1 \quad \text{for some } x_0 \in I, \]
are called a FUNDAMENTAL SYSTEM of solutions for Equation (8.1.3).

3. Note that the fundamental system for Equation (8.1.3) is not unique.
Consider a $2 \times 2$ non-singular matrix $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ with $a, b, c, d \in \mathbb{R}$. Let $\{y_1, y_2\}$ be a fundamental
system for the differential Equation (8.1.3) and $\mathbf{y}^t = [y_1, y_2]$. Then the rows of the matrix
$A\mathbf{y} = \begin{bmatrix} a y_1 + b y_2 \\ c y_1 + d y_2 \end{bmatrix}$ also form a fundamental system for Equation (8.1.3). That is, if $\{y_1, y_2\}$ is a fundamental
system for Equation (8.1.3) then $\{a y_1 + b y_2, \; c y_1 + d y_2\}$ is also a fundamental system whenever
$ad - bc = \det(A) \neq 0$.


Example 8.1.12 $\{1, x\}$ is a fundamental system for $y'' = 0$.

Note that $\{1 - x, 1 + x\}$ is also a fundamental system. Here the matrix is $\begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}$.


Exercise 8.1.13 1. State whether the following equations are SECOND-ORDER LINEAR or SECOND-ORDER
NON-LINEAR equations.

(a) $y'' + y \sin x = 5$.

(b) $y'' + (y')^2 + y \sin x = 0$.

(c) $y'' + y y' = -2$.

(d) $(x^2 + 1) y'' + (x^2 + 1)^2 y' - 5y = \sin x$.

2. By showing that $y_1 = e^{x}$ and $y_2 = e^{-x}$ are solutions of
\[ y'' - y = 0, \]
conclude that $\sinh x$ and $\cosh x$ are also solutions of $y'' - y = 0$. Do $\sinh x$ and $\cosh x$ form a
fundamental set of solutions?

3. Given that $\{\sin x, \cos x\}$ forms a basis for the solution space of $y'' + y = 0$, find another basis.


8.2 More on Second Order Equations 


In this section, we wish to study some more properties of second order equations which have nice 


applications. They also have natural generalisations to higher order equations. 


Definition 8.2.1 (General Solution) Let $y_1$ and $y_2$ be a fundamental system of solutions for
\[ y'' + q(x) y' + r(x) y = 0, \quad x \in I. \tag{8.2.1} \]
The general solution $y$ of Equation (8.2.1) is defined by
\[ y = c_1 y_1 + c_2 y_2, \quad x \in I \]
where $c_1$ and $c_2$ are arbitrary real constants. Note that $y$ is also a solution of Equation (8.2.1).

In other words, the general solution of Equation (8.2.1) is a 2-parameter family of solutions, the
parameters being $c_1$ and $c_2$.


8.2.1  Wronskian 


In this subsection, we discuss the linear independence or dependence of two solutions of Equation (8.2.1). 


Definition 8.2.2 (Wronskian) Let $y_1$ and $y_2$ be two real valued continuously differentiable functions on an
interval $I \subset \mathbb{R}$. For $x \in I$, define
\[ W(y_1, y_2) = \begin{vmatrix} y_1 & y_1' \\ y_2 & y_2' \end{vmatrix} = y_1 y_2' - y_1' y_2. \]
$W$ is called the Wronskian of $y_1$ and $y_2$.


Example 8.2.3 1. Let $y_1 = \sin x$ and $y_2 = \cos x$, $x \in I \subset \mathbb{R}$. Then
\[ W(y_1, y_2) = \begin{vmatrix} \sin x & \cos x \\ \cos x & -\sin x \end{vmatrix} = -1 \quad \text{for all } x \in I. \tag{8.2.2} \]
Hence $\{\cos x, \sin x\}$ is a linearly independent set.

2. Let $y_1 = x^2 |x|$ and $y_2 = x^3$ for $x \in (-1, 1)$. Let us now compute $y_1'$ and $y_2'$. From analysis, we know
that $y_1$ is differentiable at $x = 0$ and
\[ y_1'(x) = -3x^2 \text{ if } x < 0 \quad \text{and} \quad y_1'(x) = 3x^2 \text{ if } x \ge 0. \]
Therefore, for $x \ge 0$,
\[ W(y_1, y_2) = \begin{vmatrix} y_1 & y_1' \\ y_2 & y_2' \end{vmatrix} = \begin{vmatrix} x^3 & 3x^2 \\ x^3 & 3x^2 \end{vmatrix} = 0 \]
and for $x < 0$,
\[ W(y_1, y_2) = \begin{vmatrix} y_1 & y_1' \\ y_2 & y_2' \end{vmatrix} = \begin{vmatrix} -x^3 & -3x^2 \\ x^3 & 3x^2 \end{vmatrix} = 0. \]
That is, for all $x \in (-1, 1)$, $W(y_1, y_2) = 0$.

It is also easy to note that $y_1, y_2$ are linearly independent on $(-1, 1)$. In fact, they are linearly independent
on any interval $(a, b)$ containing 0.
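
The two computations in Example 8.2.3 can be checked numerically. The following is a minimal Python sketch, not part of the original notes: the helper name `wronskian` and the sample points are illustrative assumptions, and the derivatives are supplied by hand rather than computed symbolically. The trigonometric pair is taken in the order $\sin x, \cos x$, as in the displayed determinant.

```python
import math

def wronskian(y1, dy1, y2, dy2, x):
    """W(y1, y2)(x) = y1(x) * y2'(x) - y1'(x) * y2(x), as in Definition 8.2.2."""
    return y1(x) * dy2(x) - dy1(x) * y2(x)

# Item 1: the pair sin x, cos x has Wronskian -1 at every point.
w_trig = [wronskian(math.sin, math.cos, math.cos, lambda t: -math.sin(t), x)
          for x in (-2.0, 0.0, 0.7, 3.0)]

# Item 2: y1 = x^2 |x| (derivative 3 x |x|) and y2 = x^3 (derivative 3 x^2).
def f1(x): return x * x * abs(x)
def df1(x): return 3 * x * abs(x)
def f2(x): return x ** 3
def df2(x): return 3 * x * x

w_cubic = [wronskian(f1, df1, f2, df2, x) for x in (-0.9, -0.3, 0.0, 0.4, 0.8)]

# Independence: c1*f1 + c2*f2 = 0 at x = 0.5 and at x = -0.5 forces c1 = c2 = 0,
# because the determinant of this 2x2 system is non-zero.
det = f1(0.5) * f2(-0.5) - f2(0.5) * f1(-0.5)
```

The list `w_cubic` vanishes (up to rounding) although `det` is non-zero, which is exactly the phenomenon the example points out: a vanishing Wronskian alone does not imply linear dependence for arbitrary functions.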


Given two solutions $y_1$ and $y_2$ of Equation (8.2.1), we have a characterisation for $y_1$ and $y_2$ to be
linearly independent.


Theorem 8.2.4 Let $I \subset \mathbb{R}$ be an interval. Let $y_1$ and $y_2$ be two solutions of Equation (8.2.1). Fix a point
$x_0 \in I$. Then for any $x \in I$,
\[ W(y_1, y_2)(x) = W(y_1, y_2)(x_0) \exp\Big(-\int_{x_0}^{x} q(s)\,ds\Big). \tag{8.2.3} \]
Consequently,
\[ W(y_1, y_2)(x_0) \neq 0 \iff W(y_1, y_2)(x) \neq 0 \text{ for all } x \in I. \]

PROOF. First note that, for any $x \in I$,
\[ W(y_1, y_2) = y_1 y_2' - y_1' y_2. \]
So
\begin{align*}
\frac{d}{dx} W(y_1, y_2) &= y_1 y_2'' - y_1'' y_2 \\
&= y_1 \big(-q(x) y_2' - r(x) y_2\big) - \big(-q(x) y_1' - r(x) y_1\big) y_2 \\
&= q(x) \big(y_1' y_2 - y_1 y_2'\big) \\
&= -q(x) W(y_1, y_2).
\end{align*}
So, we have
\[ W(y_1, y_2)(x) = W(y_1, y_2)(x_0) \exp\Big(-\int_{x_0}^{x} q(s)\,ds\Big). \]
This completes the proof of the first part.

The second part follows the moment we note that the exponential function does not vanish. Alternatively,
$W(y_1, y_2)$ satisfies a first order linear homogeneous equation and therefore
\[ W(y_1, y_2) \equiv 0 \text{ if and only if } W(y_1, y_2)(x_0) = 0. \]
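
The formula of Theorem 8.2.4 can be checked numerically on a concrete equation. The sketch below is an illustration only, under stated assumptions: it uses the equation $y'' + \frac{4}{x} y' + \frac{2}{x^2} y = 0$ on $x \ge 1$ (the standard form of the equation in Example 8.2.10) with the solutions $1/x$ and $1/x^2$, and a hand-written composite Simpson rule (`simpson` is a helper introduced here, not something from the notes).

```python
import math

def simpson(g, a, b, n=200):
    """Composite Simpson rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

q = lambda x: 4.0 / x                                   # coefficient of y' in standard form
y1, dy1 = (lambda x: 1 / x), (lambda x: -1 / x ** 2)    # first solution and its derivative
y2, dy2 = (lambda x: 1 / x ** 2), (lambda x: -2 / x ** 3)

def W(x):
    """Wronskian computed directly from the two solutions."""
    return y1(x) * dy2(x) - dy1(x) * y2(x)

def abel(x, x0=1.0):
    """Right-hand side of the formula in Theorem 8.2.4."""
    return W(x0) * math.exp(-simpson(q, x0, x))

errors = [abs(W(x) - abel(x)) for x in (1.5, 2.0, 3.0)]
```

Both sides equal $-1/x^4$ here, so every entry of `errors` should be at the level of the quadrature error.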


Remark 8.2.5 1. If the Wronskian $W(y_1, y_2)$ of two solutions $y_1, y_2$ of (8.2.1) vanishes at a point
$x_0 \in I$, then $W(y_1, y_2)$ is identically zero on $I$.

2. If any two solutions $y_1, y_2$ of Equation (8.2.1) are linearly dependent (on $I$), then $W(y_1, y_2) \equiv 0$
on $I$.


Theorem 8.2.6 Let $y_1$ and $y_2$ be any two solutions of Equation (8.2.1). Let $x_0 \in I$ be arbitrary. Then $y_1$
and $y_2$ are linearly independent on $I$ if and only if $W(y_1, y_2)(x_0) \neq 0$.

PROOF. Let $y_1, y_2$ be linearly independent on $I$.
To show: $W(y_1, y_2)(x_0) \neq 0$.
Suppose not. Then $W(y_1, y_2)(x_0) = 0$. So the homogeneous linear system
\[ c_1 y_1(x_0) + c_2 y_2(x_0) = 0 \quad \text{and} \quad c_1 y_1'(x_0) + c_2 y_2'(x_0) = 0 \tag{8.2.8} \]
admits a non-zero solution $d_1, d_2$ (as its determinant is $W(y_1, y_2)(x_0) = y_1(x_0) y_2'(x_0) - y_1'(x_0) y_2(x_0) = 0$).
Let $y = d_1 y_1 + d_2 y_2$. Note that Equation (8.2.8) now implies
\[ y(x_0) = 0 \quad \text{and} \quad y'(x_0) = 0. \]
Therefore, by Picard's Theorem on existence and uniqueness of solutions (see Theorem 8.1.9), the solution
$y \equiv 0$ on $I$. That is, $d_1 y_1 + d_2 y_2 = 0$ for all $x \in I$ with $|d_1| + |d_2| \neq 0$. That is, $y_1, y_2$ are linearly
dependent on $I$. A contradiction. Therefore, $W(y_1, y_2)(x_0) \neq 0$. This proves the first part.

Suppose that $W(y_1, y_2)(x_0) \neq 0$ for some $x_0 \in I$. Therefore, by Theorem 8.2.4, $W(y_1, y_2) \neq 0$ for all
$x \in I$. Suppose that $c_1 y_1(x) + c_2 y_2(x) = 0$ for all $x \in I$. Therefore, $c_1 y_1'(x) + c_2 y_2'(x) = 0$ for all $x \in I$.
Since $x_0 \in I$, in particular, we consider the linear system of equations
\[ c_1 y_1(x_0) + c_2 y_2(x_0) = 0 \quad \text{and} \quad c_1 y_1'(x_0) + c_2 y_2'(x_0) = 0. \tag{8.2.9} \]
But then by using Theorem 2.6.1 and the condition $W(y_1, y_2)(x_0) \neq 0$, the only solution of the linear
system (8.2.9) is $c_1 = c_2 = 0$. So, by Definition 8.1.8, $y_1, y_2$ are linearly independent.


Remark 8.2.7 Recall the following from Example 8.2.3:
1. The interval $I = (-1, 1)$.
2. $y_1 = x^2 |x|$, $y_2 = x^3$ and $W(y_1, y_2) \equiv 0$ for all $x \in I$.
3. The functions $y_1$ and $y_2$ are linearly independent.

This example tells us that Theorem 8.2.6 may not hold if $y_1$ and $y_2$ are not solutions of Equation (8.2.1)
but are just some arbitrary functions on $(-1, 1)$.

The following corollary is a consequence of Theorem 8.2.6.


Corollary 8.2.8 Let $y_1, y_2$ be two linearly independent solutions of Equation (8.2.1). Let $y$ be any solution
of Equation (8.2.1). Then there exist unique real numbers $d_1, d_2$ such that
\[ y = d_1 y_1 + d_2 y_2 \quad \text{on } I. \]

PROOF. Let $x_0 \in I$. Let $y(x_0) = a$, $y'(x_0) = b$. Here $a$ and $b$ are known since the solution $y$ is given.
Also for any $x_0 \in I$, by Theorem 8.2.6, $W(y_1, y_2)(x_0) \neq 0$ as $y_1, y_2$ are linearly independent solutions of
Equation (8.2.1). Therefore by Theorem 2.6.1, the system of linear equations
\[ c_1 y_1(x_0) + c_2 y_2(x_0) = a \quad \text{and} \quad c_1 y_1'(x_0) + c_2 y_2'(x_0) = b \tag{8.2.10} \]
has a unique solution $d_1, d_2$.
Define $\phi(x) = d_1 y_1 + d_2 y_2$ for $x \in I$. Note that $\phi$ is a solution of Equation (8.2.1) with $\phi(x_0) = a$ and
$\phi'(x_0) = b$. Hence, by Picard's Theorem on existence and uniqueness (see Theorem 8.1.9), $\phi = y$ for all
$x \in I$. That is, $y = d_1 y_1 + d_2 y_2$.
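
The proof of Corollary 8.2.8 is constructive: the coefficients come from solving the $2 \times 2$ system (8.2.10). A minimal Python sketch of that step follows. The function name `coefficients`, the test equation $y'' + y = 0$ and the solution $y = \sin(x + 1)$ are illustrative assumptions, not taken from the notes; the system is solved by Cramer's rule.

```python
import math

def coefficients(y1, dy1, y2, dy2, a, b, x0):
    """Solve d1*y1(x0) + d2*y2(x0) = a and d1*y1'(x0) + d2*y2'(x0) = b by Cramer's rule."""
    w = y1(x0) * dy2(x0) - dy1(x0) * y2(x0)   # Wronskian at x0 = determinant of the system
    if w == 0:
        raise ValueError("the Wronskian vanishes at x0")
    d1 = (a * dy2(x0) - b * y2(x0)) / w
    d2 = (y1(x0) * b - dy1(x0) * a) / w
    return d1, d2

# y = sin(x + 1) solves y'' + y = 0; express it through y1 = cos x and y2 = sin x.
x0 = 0.3
y = lambda x: math.sin(x + 1.0)
dy = lambda x: math.cos(x + 1.0)
d1, d2 = coefficients(math.cos, lambda x: -math.sin(x), math.sin, math.cos,
                      y(x0), dy(x0), x0)
```

By the addition formula, $\sin(x + 1) = \sin 1 \cos x + \cos 1 \sin x$, so the computed pair should be $(\sin 1, \cos 1)$ whatever point `x0` is chosen.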
Exercise 8.2.9 1. Let $y_1$ and $y_2$ be any two linearly independent solutions of $y'' + a(x) y = 0$. Find
$W(y_1, y_2)$.

2. Let $y_1$ and $y_2$ be any two linearly independent solutions of
\[ y'' + a(x) y' + b(x) y = 0, \quad x \in I. \]
Show that $y_1$ and $y_2$ cannot both vanish at the same point $x = x_0 \in I$.

3. Show that there is no equation of the type
\[ y'' + a(x) y' + b(x) y = 0, \quad x \in [0, 2\pi] \]
admitting $y_1 = \sin x$ and $y_2 = x - \pi$ as its solutions, where $a(x)$ and $b(x)$ are any continuous functions
on $[0, 2\pi]$. [Hint: Use Exercise 8.2.9.2.]


8.2.2 Method of Reduction of Order 


We are going to show that in order to find a fundamental system for Equation (8.2.1), it is sufficient to 
have the knowledge of a solution of Equation (8.2.1). In other words, if we know one (non-zero) solution 
y1 of Equation (8.2.1), then we can determine a solution y2 of Equation (8.2.1), so that {y1, y2} forms 
a fundamental system for Equation (8.2.1). The method is described below and is usually called the 
method of reduction of order. 


Let $y_1$ be an everywhere non-zero solution of Equation (8.2.1). Assume that $y_2 = u(x) y_1$ is a solution
of Equation (8.2.1), where $u$ is to be determined. Substituting $y_2$ in Equation (8.2.1), we have (after a
bit of simplification)
\[ u'' y_1 + u' (2 y_1' + q y_1) + u (y_1'' + q y_1' + r y_1) = 0. \]
By letting $u' = v$, and observing that $y_1$ is a solution of Equation (8.2.1), we have
\[ v' y_1 + v (2 y_1' + q y_1) = 0 \]
which, after multiplying by $y_1$, is the same as
\[ \frac{d}{dx} (v y_1^2) = -q \, (v y_1^2). \]
This is a linear equation of order one (hence the name, reduction of order) in $v$, one solution of which is
\[ v y_1^2 = \exp\Big(-\int_{x_0}^{x} q(s)\,ds\Big), \quad x_0 \in I. \]
Substituting $v = u'$ and integrating we get
\[ u = \int_{x_0}^{x} \frac{1}{y_1^2(s)} \exp\Big(-\int_{x_0}^{s} q(t)\,dt\Big) ds, \quad x_0 \in I \]
and hence a second solution of Equation (8.2.1) is
\[ y_2 = y_1 \int_{x_0}^{x} \frac{1}{y_1^2(s)} \exp\Big(-\int_{x_0}^{s} q(t)\,dt\Big) ds. \]
It is left as an exercise to show that $y_1, y_2$ are linearly independent. That is, $\{y_1, y_2\}$ form a fundamental
system for Equation (8.2.1).
We illustrate the method by an example.

Example 8.2.10 Given that $y_1 = \dfrac{1}{x}$, $x \ge 1$ is a solution of
\[ x^2 y'' + 4x y' + 2y = 0, \tag{8.2.11} \]
determine another solution $y_2$ of (8.2.11), such that the solutions $y_1, y_2$, for $x \ge 1$ are linearly independent.
Solution: Dividing (8.2.11) by $x^2$, the coefficient of $y'$ in the standard form is $\dfrac{4}{x}$. We take $x_0 = 1$ and
$y_2(x) = u(x) y_1(x)$, where $u$ is given by
\begin{align*}
u &= \int_{1}^{x} \frac{1}{y_1^2(s)} \exp\Big(-\int_{1}^{s} \frac{4}{t}\,dt\Big) ds \\
&= \int_{1}^{x} \frac{1}{y_1^2(s)} \exp\big(\ln s^{-4}\big)\,ds \\
&= \int_{1}^{x} \frac{s^2}{s^4}\,ds = 1 - \frac{1}{x}.
\end{align*}
So,
\[ y_2(x) = \frac{1}{x} - \frac{1}{x^2}. \]
Since the term $\dfrac{1}{x}$ already appears in $y_1$, we can take $y_2 = \dfrac{1}{x^2}$. So, $\dfrac{1}{x}$ and $\dfrac{1}{x^2}$ are the required two linearly
independent solutions of (8.2.11).
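
The reduction of order formula can also be evaluated numerically, which is a useful sanity check when the integrals have no convenient closed form. The following Python sketch is an illustration under stated assumptions: `simpson` and `second_solution` are helper names introduced here, the nested integrals are approximated by the composite Simpson rule, and the data are those of Example 8.2.10 ($y_1 = 1/x$, coefficient of $y'$ equal to $4/x$, $x_0 = 1$).

```python
import math

def simpson(g, a, b, n=200):
    """Composite Simpson rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def second_solution(y1, coeff, x0, x, n=200):
    """y1(x) * integral from x0 to x of exp(-integral from x0 to s of coeff) / y1(s)^2 ds."""
    def integrand(s):
        return math.exp(-simpson(coeff, x0, s, n)) / y1(s) ** 2
    return y1(x) * simpson(integrand, x0, x, n)

y1 = lambda x: 1.0 / x          # known solution
coeff = lambda x: 4.0 / x       # coefficient of y' after dividing (8.2.11) by x^2
value = second_solution(y1, coeff, 1.0, 2.0)
```

At $x = 2$ the closed form $\frac{1}{x} - \frac{1}{x^2}$ from the example gives $0.25$, which `value` should reproduce to quadrature accuracy.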


Exercise 8.2.11 In the following, use the given solution $y_1$ to find another solution $y_2$ so that the two
solutions $y_1$ and $y_2$ are linearly independent.

1. $y'' = 0$, $y_1 = 1$, $x \ge 0$.

2. $y'' + 2y' + y = 0$, $y_1 = e^{-x}$, $x \ge 0$.

3. $x^2 y'' - x y' + y = 0$, $y_1 = x$, $x \ge 1$.

4. $x y'' + y' = 0$, $y_1 = 1$, $x \ge 1$.

5. $y'' + x y' - y = 0$, $y_1 = x$, $x \ge 1$.


8.3 Second Order Equations with Constant Coefficients

Definition 8.3.1 Let $a$ and $b$ be constant real numbers. An equation
\[ y'' + a y' + b y = 0 \tag{8.3.1} \]
is called a SECOND ORDER HOMOGENEOUS LINEAR EQUATION WITH CONSTANT COEFFICIENTS.

Let us assume that $y = e^{\lambda x}$ is a solution of Equation (8.3.1) (where $\lambda$ is a constant, and is to be
determined). To simplify the matter, we denote
\[ L(y) = y'' + a y' + b y \]
and
\[ p(\lambda) = \lambda^2 + a \lambda + b. \]
It is easy to note that
\[ L(e^{\lambda x}) = p(\lambda) e^{\lambda x}. \]
Now, it is clear that $e^{\lambda x}$ is a solution of Equation (8.3.1) if and only if
\[ p(\lambda) = 0. \tag{8.3.2} \]
Equation (8.3.2) is called the CHARACTERISTIC EQUATION of Equation (8.3.1). Equation (8.3.2) is a
quadratic equation and admits 2 roots (repeated roots being counted twice).

Case 1: Let $\lambda_1, \lambda_2$ be real roots of Equation (8.3.2) with $\lambda_1 \neq \lambda_2$.
Then $e^{\lambda_1 x}$ and $e^{\lambda_2 x}$ are two solutions of Equation (8.3.1) and moreover they are linearly independent
(since $\lambda_1 \neq \lambda_2$). That is, $\{e^{\lambda_1 x}, e^{\lambda_2 x}\}$ forms a fundamental system of solutions of Equation (8.3.1).

Case 2: Let $\lambda_1 = \lambda_2$ be a repeated root of $p(\lambda) = 0$.
Then $p'(\lambda_1) = 0$. Now,
\[ L(x e^{\lambda x}) = \frac{\partial}{\partial \lambda} L(e^{\lambda x}) = \frac{\partial}{\partial \lambda} \big(p(\lambda) e^{\lambda x}\big) = p'(\lambda) e^{\lambda x} + x\, p(\lambda) e^{\lambda x}. \]
But $p(\lambda_1) = p'(\lambda_1) = 0$ and therefore,
\[ L(x e^{\lambda_1 x}) = 0. \]
Hence, $e^{\lambda_1 x}$ and $x e^{\lambda_1 x}$ are two linearly independent solutions of Equation (8.3.1). In this case also, we have
a fundamental system of solutions of Equation (8.3.1).

Case 3: Let $\lambda = \alpha + i\beta$ (with $\beta \neq 0$) be a complex root of Equation (8.3.2).
So, $\alpha - i\beta$ is also a root of Equation (8.3.2). Before we proceed, we note:

Lemma 8.3.2 Let $y = u + iv$ be a solution of Equation (8.3.1), where $u$ and $v$ are real valued functions.
Then $u$ and $v$ are solutions of Equation (8.3.1). In other words, the real part and the imaginary part of a
complex valued solution (of a real variable ODE Equation (8.3.1)) are themselves solutions of Equation (8.3.1).

PROOF. Exercise.

Let $\lambda = \alpha + i\beta$ be a complex root of $p(\lambda) = 0$. Then
\[ e^{\alpha x} \big(\cos(\beta x) + i \sin(\beta x)\big) \]
is a complex solution of Equation (8.3.1). By Lemma 8.3.2, $y_1 = e^{\alpha x} \cos(\beta x)$ and $y_2 = e^{\alpha x} \sin(\beta x)$ are
solutions of Equation (8.3.1). It is easy to note that $y_1$ and $y_2$ are linearly independent. It is as good as
saying $\{e^{\alpha x} \cos(\beta x), e^{\alpha x} \sin(\beta x)\}$ forms a fundamental system of solutions of Equation (8.3.1).
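
The three cases can be summarised as a small routine that selects the fundamental system from the sign of the discriminant $a^2 - 4b$. This is a sketch for illustration only: the names `fundamental_system` and `residual` are hypothetical, the exact test `disc == 0` is only reliable for exactly representable coefficients, and the residual is estimated with central differences rather than exact derivatives.

```python
import math

def fundamental_system(a, b):
    """Return (case, y1, y2) for y'' + a*y' + b*y = 0, following Cases 1 to 3."""
    disc = a * a - 4 * b
    if disc > 0:                                   # Case 1: distinct real roots
        l1 = (-a + math.sqrt(disc)) / 2
        l2 = (-a - math.sqrt(disc)) / 2
        return 1, (lambda x: math.exp(l1 * x)), (lambda x: math.exp(l2 * x))
    if disc == 0:                                  # Case 2: repeated real root
        l1 = -a / 2
        return 2, (lambda x: math.exp(l1 * x)), (lambda x: x * math.exp(l1 * x))
    alpha, beta = -a / 2, math.sqrt(-disc) / 2     # Case 3: roots alpha +/- i*beta
    return 3, (lambda x: math.exp(alpha * x) * math.cos(beta * x)), \
              (lambda x: math.exp(alpha * x) * math.sin(beta * x))

def residual(y, a, b, x, h=1e-4):
    """Central-difference estimate of y'' + a*y' + b*y at the point x."""
    d1 = (y(x + h) - y(x - h)) / (2 * h)
    d2 = (y(x + h) - 2 * y(x) + y(x - h)) / (h * h)
    return d2 + a * d1 + b * y(x)

# Three illustrative equations, one per case.
results = {(a, b): fundamental_system(a, b) for (a, b) in [(-3, 2), (-2, 1), (2, 5)]}
```

For each pair `(a, b)` the two returned functions should give a residual close to zero, up to the finite-difference error.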


Exercise 8.3.3 1. Find the general solution of the following equations.

(a) $y'' - 4y' + 3y = 0$.

(b) $2y'' + 5y = 0$.

(c) $y'' - 9y = 0$.

(d) $y'' + k^2 y = 0$, where $k$ is a real constant.

2. Solve the following IVP's.

(a) $y'' + y = 0$, $y(0) = 0$, $y'(0) = 1$.

(b) $y'' - y = 0$, $y(0) = 1$, $y'(0) = 1$.

(c) $y'' + 4y = 0$, $y(0) = -1$, $y'(0) = -3$.

(d) $y'' + 4y' + 4y = 0$, $y(0) = 1$, $y'(0) = 0$.

3. Find two linearly independent solutions $y_1$ and $y_2$ of the following equations.
(a) $y'' - 5y = 0$.
(b) $y'' + 6y' + 5y = 0$.
(c) $y'' + 5y = 0$.
(d) $y'' + 6y' + 9y = 0$.
Also, in each case, find $W(y_1, y_2)$.

4. Show that the IVP
\[ y'' + y = 0, \quad y(0) = 0 \text{ and } y'(0) = B \]
has a unique solution for any real number $B$.

5. Consider the problem
\[ y'' + y = 0, \quad y(0) = 0 \text{ and } y(\pi) = B. \tag{8.3.3} \]
Show that it has a solution if and only if $B = 0$. Compare this with Exercise 4. Also, show that if
$B = 0$, then there are infinitely many solutions to (8.3.3).


8.4 Non Homogeneous Equations 


Throughout this section, $I$ denotes an interval in $\mathbb{R}$. We assume that $q(\cdot)$, $r(\cdot)$ and $f(\cdot)$ are real valued
continuous functions defined on $I$. Now, we focus our attention on the study of the non-homogeneous equation
of the form
\[ y'' + q(x) y' + r(x) y = f(x). \tag{8.4.1} \]
We assume that the functions $q(\cdot)$, $r(\cdot)$ and $f(\cdot)$ are known/given. The non-zero function $f(\cdot)$ in
(8.4.1) is also called the non-homogeneous term or the forcing function. The equation
\[ y'' + q(x) y' + r(x) y = 0 \tag{8.4.2} \]
is called the homogeneous equation corresponding to (8.4.1).
Consider the set of all twice differentiable functions defined on $I$. We define an operator $L$ on this
set by
\[ L(y) = y'' + q(x) y' + r(x) y. \]
Then (8.4.1) and (8.4.2) can be rewritten in the (compact) form
\[ L(y) = f, \qquad L(y) = 0. \]
The ensuing result relates the solutions of (8.4.1) and (8.4.2).


Theorem 8.4.1 1. Let $y_1$ and $y_2$ be two solutions of (8.4.1) on $I$. Then $y = y_1 - y_2$ is a solution of
(8.4.2).

2. Let $z$ be any solution of (8.4.1) on $I$ and let $z_1$ be any solution of (8.4.2). Then $y = z + z_1$ is a solution
of (8.4.1) on $I$.

PROOF. Observe that $L$ is a linear transformation on the set of twice differentiable functions on $I$. Since
$y_1$ and $y_2$ are solutions of (8.4.1), we have
\[ L(y_1) = f \quad \text{and} \quad L(y_2) = f. \]
The linearity of $L$ implies that $L(y_1 - y_2) = 0$ or equivalently, $y = y_1 - y_2$ is a solution of (8.4.2).
For the proof of the second part, note that
\[ L(z) = f \quad \text{and} \quad L(z_1) = 0 \]
implies that
\[ L(z + z_1) = L(z) + L(z_1) = f. \]
Thus, $y = z + z_1$ is a solution of (8.4.1).


The above result leads us to the following definition. 


Definition 8.4.2 (General Solution) A general solution of (8.4.1) on $I$ is a solution of (8.4.1) of the form
\[ y = y_h + y_p, \quad x \in I \]
where $y_h = c_1 y_1 + c_2 y_2$ is a general solution of the corresponding homogeneous equation (8.4.2) and $y_p$ is
any solution of (8.4.1) (preferably containing no arbitrary constants).


We now prove that the solution of (8.4.1) with initial conditions is unique. 


Theorem 8.4.3 (Uniqueness) Suppose that $x_0 \in I$. Let $y_1$ and $y_2$ be two solutions of the IVP
\[ y'' + q y' + r y = f, \quad y(x_0) = a, \; y'(x_0) = b. \tag{8.4.5} \]
Then $y_1 = y_2$ for all $x \in I$.

PROOF. Let $z = y_1 - y_2$. Then $z$ satisfies
\[ L(z) = 0, \quad z(x_0) = 0, \quad z'(x_0) = 0. \]
By the uniqueness theorem 8.1.9, we have $z \equiv 0$ on $I$. Or in other words, $y_1 \equiv y_2$ on $I$.


Remark 8.4.4 The above results tell us that to solve (i.e., to find the general solution of) (8.4.1) or the
IVP (8.4.5), we need to find the general solution of the homogeneous equation (8.4.2) and a particular
solution $y_p$ of (8.4.1). To repeat, the two steps needed to solve (8.4.1) are:

1. compute the general solution of (8.4.2), and

2. compute a particular solution of (8.4.1).

Then add the two solutions.

Step 1 has been dealt with in the previous sections. The remainder of the section is devoted to step 2, i.e.,
we elaborate some methods for computing a particular solution $y_p$ of (8.4.1).


Exercise 8.4.5 1. Find the general solution of the following equations:

(a) $y'' + 5y' = -5$. (You may note here that $y = -x$ is a particular solution.)

(b) $y'' - y = -2 \sin x$. (First show that $y = \sin x$ is a particular solution.)

2. Solve the following IVPs:

(a) $y'' + y = 2e^{x}$, $y(0) = 0 = y'(0)$. (It is given that $y = e^{x}$ is a particular solution.)

(b) $y'' - y = -2 \cos x$, $y(0) = 0$, $y'(0) = 1$. (First guess a particular solution using the idea given in
Exercise 8.4.5.1b.)

3. Let $f_1(x)$ and $f_2(x)$ be two continuous functions. Let $y_i$'s be particular solutions of
\[ y'' + q(x) y' + r(x) y = f_i(x), \quad i = 1, 2; \]
where $q(x)$ and $r(x)$ are continuous functions. Show that $y_1 + y_2$ is a particular solution of
$y'' + q(x) y' + r(x) y = f_1(x) + f_2(x)$.
8.5 Variation of Parameters


In the previous section, calculation of particular integrals/solutions for some special cases has been
studied. Recall that the homogeneous part of the equation had constant coefficients. In this section, we
deal with a useful technique of finding a particular solution when the coefficients of the homogeneous
part are continuous functions and the forcing function $f(x)$ (or the non-homogeneous term) is piecewise
continuous. Suppose $y_1$ and $y_2$ are two linearly independent solutions of
\[ y'' + q(x) y' + r(x) y = 0 \tag{8.5.1} \]
on $I$, where $q(x)$ and $r(x)$ are arbitrary continuous functions defined on $I$. Then we know that
\[ y = c_1 y_1 + c_2 y_2 \]
is a solution of (8.5.1) for any constants $c_1$ and $c_2$. We now "vary" $c_1$ and $c_2$ to functions of $x$, so that
\[ y = u(x) y_1 + v(x) y_2, \quad x \in I \tag{8.5.2} \]
is a solution of the equation
\[ y'' + q(x) y' + r(x) y = f(x), \quad \text{on } I, \tag{8.5.3} \]
where $f$ is a piecewise continuous function defined on $I$. The details are given in the following theorem.


Theorem 8.5.1 (Method of Variation of Parameters) Let $q(x)$ and $r(x)$ be continuous functions defined
on $I$ and let $f$ be a piecewise continuous function on $I$. Let $y_1$ and $y_2$ be two linearly independent solutions
of (8.5.1) on $I$. Then a particular solution $y_p$ of (8.5.3) is given by
\[ y_p = -y_1 \int \frac{y_2 f(x)}{W}\,dx + y_2 \int \frac{y_1 f(x)}{W}\,dx, \tag{8.5.4} \]
where $W = W(y_1, y_2)$ is the Wronskian of $y_1$ and $y_2$. (Note that the integrals in (8.5.4) are the indefinite
integrals of the respective arguments.)


PROOF. Let $u(x)$ and $v(x)$ be continuously differentiable functions (to be determined) such that
\[ y_p = u y_1 + v y_2, \quad x \in I \tag{8.5.5} \]
is a particular solution of (8.5.3). Differentiation of (8.5.5) leads to
\[ y_p' = u y_1' + v y_2' + u' y_1 + v' y_2. \tag{8.5.6} \]
We choose $u$ and $v$ so that
\[ u' y_1 + v' y_2 = 0. \tag{8.5.7} \]
Substituting (8.5.7) in (8.5.6), we have
\[ y_p' = u y_1' + v y_2', \quad \text{and} \quad y_p'' = u y_1'' + v y_2'' + u' y_1' + v' y_2'. \tag{8.5.8} \]
Since $y_p$ is a particular solution of (8.5.3), substituting (8.5.5) and (8.5.8) in (8.5.3), we get
\[ u \big(y_1'' + q(x) y_1' + r(x) y_1\big) + v \big(y_2'' + q(x) y_2' + r(x) y_2\big) + u' y_1' + v' y_2' = f(x). \]
As $y_1$ and $y_2$ are solutions of the homogeneous equation (8.5.1), we obtain the condition
\[ u' y_1' + v' y_2' = f(x). \tag{8.5.9} \]
We now determine $u$ and $v$ from (8.5.7) and (8.5.9). By using Cramer's rule for a linear system of
equations, we get
\[ u' = -\frac{y_2 f(x)}{W} \quad \text{and} \quad v' = \frac{y_1 f(x)}{W} \tag{8.5.10} \]
(note that $y_1$ and $y_2$ are linearly independent solutions of (8.5.1) and hence the Wronskian $W \neq 0$ for
any $x \in I$). Integration of (8.5.10) gives us
\[ u = -\int \frac{y_2 f(x)}{W}\,dx \quad \text{and} \quad v = \int \frac{y_1 f(x)}{W}\,dx \tag{8.5.11} \]
(without loss of generality, we set the values of the integration constants to zero). Equations (8.5.11) and
(8.5.5) yield the desired result. Thus the proof is complete.


Before we move on to some examples, the following comments are useful.

Remark 8.5.2 1. The integrals in (8.5.11) exist, because $y_1$, $y_2$ and $W (\neq 0)$ are continuous functions
and $f$ is a piecewise continuous function. Sometimes, it is useful to write (8.5.11) in the form
\[ u = -\int_{x_0}^{x} \frac{y_2(s) f(s)}{W(s)}\,ds \quad \text{and} \quad v = \int_{x_0}^{x} \frac{y_1(s) f(s)}{W(s)}\,ds, \]
where $x \in I$ and $x_0$ is a fixed point in $I$. In such a case, the particular solution $y_p$ as given by
(8.5.4) assumes the form
\[ y_p = -y_1 \int_{x_0}^{x} \frac{y_2(s) f(s)}{W(s)}\,ds + y_2 \int_{x_0}^{x} \frac{y_1(s) f(s)}{W(s)}\,ds \tag{8.5.12} \]
for a fixed point $x_0 \in I$ and for any $x \in I$.

2. Again, we stress here that $q$ and $r$ are assumed to be continuous. They need not be constants.
Also, $f$ is a piecewise continuous function on $I$.

3. A word of caution. While using (8.5.4), one has to keep in mind that the coefficient of $y''$ in (8.5.3)
is 1.
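
Formula (8.5.12) lends itself to direct numerical evaluation. The following Python sketch is an illustration under stated assumptions: the helper names `simpson` and `particular_solution` are introduced here, the definite integrals are approximated by the composite Simpson rule, and the test problem $y'' + y = 1$ with $y_1 = \cos x$, $y_2 = \sin x$, $x_0 = 0$ is chosen for this sketch and does not come from the notes.

```python
import math

def simpson(g, a, b, n=200):
    """Composite Simpson rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def particular_solution(y1, dy1, y2, dy2, f, x0, x):
    """Evaluate (8.5.12) at x; the coefficient of y'' is assumed to be 1."""
    W = lambda s: y1(s) * dy2(s) - dy1(s) * y2(s)
    u = -simpson(lambda s: y2(s) * f(s) / W(s), x0, x)
    v = simpson(lambda s: y1(s) * f(s) / W(s), x0, x)
    return u * y1(x) + v * y2(x)

# y'' + y = 1 with y1 = cos x, y2 = sin x and lower limit x0 = 0.
yp = particular_solution(math.cos, lambda s: -math.sin(s),
                         math.sin, math.cos,
                         lambda s: 1.0, 0.0, 1.0)
```

For this test problem the integrals can be done by hand and give $y_p = 1 - \cos x$, so `yp` should agree with $1 - \cos 1$. With the definite-integral form, $y_p(x_0) = 0$ and $y_p'(x_0) = 0$, which is convenient for initial value problems.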


Example 8.5.3 1. Find the general solution of
\[ y'' + y = \frac{1}{2 + \sin x}, \quad x \ge 0. \]
Solution: The general solution of the corresponding homogeneous equation $y'' + y = 0$ is given by
\[ y_h = c_1 \cos x + c_2 \sin x. \]
Here, the solutions $y_1 = \sin x$ and $y_2 = \cos x$ are linearly independent over $I = [0, \infty)$ and
$W = W(\sin x, \cos x) = -1$. Therefore, a particular solution, $y_p$, by Theorem 8.5.1, is
\begin{align*}
y_p &= -y_1 \int \frac{y_2}{W (2 + \sin x)}\,dx + y_2 \int \frac{y_1}{W (2 + \sin x)}\,dx \\
&= \sin x \int \frac{\cos x}{2 + \sin x}\,dx - \cos x \int \frac{\sin x}{2 + \sin x}\,dx. \tag{8.5.13}
\end{align*}
So, the required general solution is
\[ y = c_1 \cos x + c_2 \sin x + y_p \]
where $y_p$ is given by (8.5.13).

2. Find a particular solution of
\[ x^2 y'' - 2x y' + 2y = x^3, \quad x > 0. \]
Solution: Verify that the given equation is
\[ y'' - \frac{2}{x} y' + \frac{2}{x^2} y = x \]
and two linearly independent solutions of the corresponding homogeneous part are $y_1 = x$ and $y_2 = x^2$.
Here
\[ W = W(x, x^2) = \begin{vmatrix} x & x^2 \\ 1 & 2x \end{vmatrix} = x^2 \neq 0, \quad x > 0. \]
By Theorem 8.5.1, a particular solution $y_p$ is given by
\begin{align*}
y_p &= -x \int \frac{x^2 \cdot x}{x^2}\,dx + x^2 \int \frac{x \cdot x}{x^2}\,dx \\
&= -\frac{x^3}{2} + x^3 = \frac{x^3}{2}.
\end{align*}
The readers should note that the methods of Section 8.7 are not applicable as the given equation is
not an equation with constant coefficients.


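Exercise 8.5.4 1. Find a particular solution for the following problems:

(a) $y'' + y = f(x)$, $0 \le x \le 1$, where $f$ is a piecewise defined function [its two-case definition is illegible in the source].

(b) $y'' + y = 2 \sec x$ for all $x \in (0, \frac{\pi}{2})$.
(c) $y'' - 3y' + 2y = -2 \cos(e^{-x})$, $x > 0$.
(d) $x^2 y'' + x y' - y = 2x$, $x > 0$.

2. Use the method of variation of parameters to find the general solution of
(a) $y'' - y = -e^{x}$ for all $x \in \mathbb{R}$.
(b) $y'' + y = \sin x$ for all $x \in \mathbb{R}$.

3. Solve the following IVPs:

(a) $y'' + y = f(x)$, $x \ge 0$, where $f(x) = 0$ if $0 \le x < 1$ [the second case of the definition is illegible in the source],
with $y(0) = 0 = y'(0)$.

(b) $y'' - y = |x|$ for all $x \in [-1, \infty)$ with $y(-1) = 0$ and $y'(-1) = 1$.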


8.6 Higher Order Equations with Constant Coefficients 


This section is devoted to an introductory study of higher order linear equations with constant coefficients.
This is an extension of the study of 2nd order linear equations with constant coefficients (see
Section 8.3).

The standard form of a linear $n$-th order differential equation with constant coefficients is given by
\[ L_n(y) = f(x) \quad \text{on } I, \tag{8.6.1} \]
where
\[ L_n \equiv \frac{d^n}{dx^n} + a_1 \frac{d^{n-1}}{dx^{n-1}} + \cdots + a_{n-1} \frac{d}{dx} + a_n \]
is a linear differential operator of order $n$ with constant coefficients, $a_1, a_2, \ldots, a_n$ being real constants
(called the coefficients of the linear equation), and the function $f(x)$ is a piecewise continuous function
defined on the interval $I$. We will be using the notation $y^{(n)}$ for the $n$-th derivative of $y$. If $f(x) = 0$, then
(8.6.1), which reduces to
\[ L_n(y) = 0 \quad \text{on } I, \tag{8.6.2} \]
is called a homogeneous linear equation; otherwise (8.6.1) is called a non-homogeneous linear equation.
The function $f$ is also known as the non-homogeneous term or a forcing term.

Definition 8.6.1 A function y defined on I is called a solution of (8.6.1) if y is n times differentiable and 
y along with its derivatives satisfy (8.6.1). 


Remark 8.6.2 1. If $u$ and $v$ are any two solutions of (8.6.1), then $y = u - v$ is a solution of
(8.6.2). Hence, if $v$ is a solution of (8.6.2) and $y_p$ is a solution of (8.6.1), then $u = v + y_p$ is a
solution of (8.6.1).

2. Let $y_1$ and $y_2$ be two solutions of (8.6.2). Then for any constants (need not be real) $c_1, c_2$,
\[ y = c_1 y_1 + c_2 y_2 \]
is also a solution of (8.6.2). The solution $y$ is called the superposition of $y_1$ and $y_2$.

3. Note that $y \equiv 0$ is a solution of (8.6.2). This, along with the super-position principle, ensures that
the set of solutions of (8.6.2) forms a vector space over $\mathbb{R}$. This vector space is called the SOLUTION
SPACE or space of solutions of (8.6.2).


As in Section 8.3, we first take up the study of (8.6.2). It is easy to note (as in Section 8.3) that for
a constant $\lambda$,
\[ L_n(e^{\lambda x}) = p(\lambda) e^{\lambda x} \]
where
\[ p(\lambda) = \lambda^n + a_1 \lambda^{n-1} + \cdots + a_n. \tag{8.6.3} \]

Definition 8.6.3 (Characteristic Equation) The equation $p(\lambda) = 0$, where $p(\lambda)$ is defined in (8.6.3), is
called the CHARACTERISTIC EQUATION of (8.6.2).

Note that $p(\lambda)$ is a polynomial of degree $n$ with real coefficients. Thus, it has $n$ zeros (counting with
multiplicities). Also, in case of complex roots, they will occur in conjugate pairs. In view of this, we
have the following theorem. The proof of the theorem is omitted.


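Theorem 8.6.4 $e^{\lambda x}$ is a solution of (8.6.2) on any interval $I \subset \mathbb{R}$ if and only if $\lambda$ is a root of $p(\lambda) = 0$.
1. If $\lambda_1, \lambda_2, \ldots, \lambda_n$ are distinct roots of $p(\lambda) = 0$, then
\[ e^{\lambda_1 x}, e^{\lambda_2 x}, \ldots, e^{\lambda_n x} \]
are the $n$ linearly independent solutions of (8.6.2).
2. If $\lambda_1$ is a repeated root of $p(\lambda) = 0$ of multiplicity $k$, i.e., $\lambda_1$ is a zero of (8.6.3) repeated $k$ times, then
\[ e^{\lambda_1 x}, x e^{\lambda_1 x}, \ldots, x^{k-1} e^{\lambda_1 x} \]
are linearly independent solutions of (8.6.2), corresponding to the root $\lambda_1$ of $p(\lambda) = 0$.
3. If $\lambda_1 = \alpha + i\beta$ is a complex root of $p(\lambda) = 0$, then so is the complex conjugate $\overline{\lambda_1} = \alpha - i\beta$. Then the
corresponding linearly independent solutions of (8.6.2) are
\[ y_1 = e^{\alpha x} \big(\cos(\beta x) + i \sin(\beta x)\big) \quad \text{and} \quad y_2 = e^{\alpha x} \big(\cos(\beta x) - i \sin(\beta x)\big). \]
These are complex valued functions of $x$. However, using the super-position principle, we note that
\[ \frac{y_1 + y_2}{2} = e^{\alpha x} \cos(\beta x) \quad \text{and} \quad \frac{y_1 - y_2}{2i} = e^{\alpha x} \sin(\beta x) \]
are also solutions of (8.6.2). Thus, in the case of $\lambda_1 = \alpha + i\beta$ being a complex root of $p(\lambda) = 0$, we
have the linearly independent solutions
\[ e^{\alpha x} \cos(\beta x) \quad \text{and} \quad e^{\alpha x} \sin(\beta x). \]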


Example 8.6.5 1. Find the solution space of the differential equation
\[ y''' - 6y'' + 11y' - 6y = 0. \]
Solution: Its characteristic equation is
\[ p(\lambda) = \lambda^3 - 6\lambda^2 + 11\lambda - 6 = 0. \]
By inspection, the roots of $p(\lambda) = 0$ are $\lambda = 1, 2, 3$. So, the linearly independent solutions are $e^{x}, e^{2x}, e^{3x}$
and the solution space is
\[ \{c_1 e^{x} + c_2 e^{2x} + c_3 e^{3x} : c_1, c_2, c_3 \in \mathbb{R}\}. \]

2. Find the solution space of the differential equation
\[ y''' - 2y'' + y' = 0. \]
Solution: Its characteristic equation is
\[ p(\lambda) = \lambda^3 - 2\lambda^2 + \lambda = 0. \]
By inspection, the roots of $p(\lambda) = 0$ are $\lambda = 0, 1, 1$. So, the linearly independent solutions are $1, e^{x}, x e^{x}$
and the solution space is
\[ \{c_1 + c_2 e^{x} + c_3 x e^{x} : c_1, c_2, c_3 \in \mathbb{R}\}. \]

3. Find the solution space of the differential equation
\[ y^{(4)} + 2y'' + y = 0. \]
Solution: Its characteristic equation is
\[ p(\lambda) = \lambda^4 + 2\lambda^2 + 1 = 0. \]
By inspection, the roots of $p(\lambda) = 0$ are $\lambda = i, i, -i, -i$. So, the linearly independent solutions are
$\sin x, x \sin x, \cos x, x \cos x$ and the solution space is
\[ \{c_1 \sin x + c_2 \cos x + c_3 x \sin x + c_4 x \cos x : c_1, c_2, c_3, c_4 \in \mathbb{R}\}. \]
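
The roots found "by inspection" in Example 8.6.5, together with their multiplicities, can be confirmed mechanically. The sketch below is an illustration only, with hypothetical helper names (`poly_eval`, `derivative`, `multiplicity`): a root $\lambda$ of $p$ has multiplicity $m$ exactly when $p, p', \ldots, p^{(m-1)}$ vanish at $\lambda$ and $p^{(m)}$ does not, and this is tested with Horner evaluation.

```python
def poly_eval(coeffs, lam):
    """Horner evaluation of p(lam); coeffs are listed from the leading term down."""
    value = 0
    for c in coeffs:
        value = value * lam + c
    return value

def derivative(coeffs):
    """Coefficients of the derivative p'(lam), leading term first."""
    n = len(coeffs) - 1
    return [c * (n - k) for k, c in enumerate(coeffs[:-1])]

def multiplicity(coeffs, lam, tol=1e-9):
    """Number of successive derivatives of p vanishing at lam (0 if lam is not a root)."""
    m = 0
    while len(coeffs) > 1 and abs(poly_eval(coeffs, lam)) < tol:
        m += 1
        coeffs = derivative(coeffs)
    return m

p1 = [1, -6, 11, -6]      # lambda^3 - 6 lambda^2 + 11 lambda - 6
p2 = [1, -2, 1, 0]        # lambda^3 - 2 lambda^2 + lambda
p3 = [1, 0, 2, 0, 1]      # lambda^4 + 2 lambda^2 + 1

m1 = [multiplicity(p1, lam) for lam in (1, 2, 3)]
m2 = [multiplicity(p2, lam) for lam in (0, 1)]
m3 = [multiplicity(p3, lam) for lam in (1j, -1j)]
```

A root of multiplicity $k$ contributes the $k$ solutions $e^{\lambda x}, x e^{\lambda x}, \ldots, x^{k-1} e^{\lambda x}$, so the multiplicities in each list should add up to the order of the equation.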
From the above discussion, it is clear that the linear homogeneous equation (8.6.2) admits $n$ linearly
independent solutions since the algebraic equation $p(\lambda) = 0$ has exactly $n$ roots (counting with
multiplicity).

Definition 8.6.6 (General Solution) Let $y_1, y_2, \ldots, y_n$ be any set of $n$ linearly independent solutions of
(8.6.2). Then
\[ y = c_1 y_1 + c_2 y_2 + \cdots + c_n y_n \]
is called a general solution of (8.6.2), where $c_1, c_2, \ldots, c_n$ are arbitrary real constants.

Example 8.6.7 1. Find the general solution of $y''' = 0$.
Solution: Note that 0 is the repeated root of the characteristic equation $\lambda^3 = 0$. So, the general
solution is
\[ y = c_1 + c_2 x + c_3 x^2. \]

2. Find the general solution of
\[ y''' + y'' + y' + y = 0. \]
Solution: Note that the roots of the characteristic equation $\lambda^3 + \lambda^2 + \lambda + 1 = 0$ are $-1, i, -i$. So,
the general solution is
\[ y = c_1 e^{-x} + c_2 \sin x + c_3 \cos x. \]


Exercise 8.6.8 1. Find the general solution of the following differential equations:

(a) $y''' + y' = 0$.
(b) $y''' + 5y' - 6y = 0$.
(c) $y^{(iv)} + 2y'' + y = 0$.

2. Find a linear differential equation with constant coefficients and of order 3 which admits the following
solutions:
(a) $\cos x$, $\sin x$ and $e^{-3x}$.
(b) $e^{x}$, $e^{2x}$ and $e^{3x}$.
(c) $1$, $e^{x}$ and $x$.

3. Solve the following IVPs:

(a) $y^{(iv)} - y = 0$, $y(0) = 0$, $y'(0) = 0$, $y''(0) = 0$, $y'''(0) = 1$.
(b) $2y''' + y'' + 2y' + y = 0$, $y(0) = 0$, $y'(0) = 1$, $y''(0) = 0$.

4. Euler Cauchy Equations:

Let $a_0, a_1, \ldots, a_{n-1} \in \mathbb{R}$ be given constants. The equation
\[ x^n \frac{d^n y}{dx^n} + a_{n-1} x^{n-1} \frac{d^{n-1} y}{dx^{n-1}} + \cdots + a_0 y = 0, \quad x \in I \tag{8.6.4} \]
is called the homogeneous Euler-Cauchy Equation (or just Euler's Equation) of degree $n$. (8.6.4) is also
called the standard form of the Euler equation. We define
\[ L(y) = x^n \frac{d^n y}{dx^n} + a_{n-1} x^{n-1} \frac{d^{n-1} y}{dx^{n-1}} + \cdots + a_0 y. \]
Then substituting $y = x^{\lambda}$, we get
\[ L(x^{\lambda}) = \big(\lambda(\lambda - 1) \cdots (\lambda - n + 1) + a_{n-1} \lambda(\lambda - 1) \cdots (\lambda - n + 2) + \cdots + a_0\big) x^{\lambda}. \]
So, $x^{\lambda}$ is a solution of (8.6.4) if and only if
\[ \lambda(\lambda - 1) \cdots (\lambda - n + 1) + a_{n-1} \lambda(\lambda - 1) \cdots (\lambda - n + 2) + \cdots + a_0 = 0. \tag{8.6.5} \]
Essentially, for finding the solutions of (8.6.4), we need to find the roots of (8.6.5), which is a polynomial
in $\lambda$. With the above understanding, solve the following homogeneous Euler equations:

(a) $x^3 y''' + 3x^2 y'' + 2x y' = 0$.

(b) $x^3 y''' - 6x^2 y'' + 11x y' - 6y = 0$.

(c) $x^3 y''' - x^2 y'' + x y' - y = 0$.

For an alternative method of solving (8.6.4), see the next exercise.

5. Consider the Euler equation (8.6.4) with $x > 0$ and $x \in I$. Let $x = e^{t}$ or equivalently $t = \ln x$. Let
$D = \frac{d}{dt}$ and $d = \frac{d}{dx}$. Then
(a) show that $x\,d(y) = D y(t)$, or equivalently $x \frac{dy}{dx} = \frac{dy}{dt}$.
(b) using mathematical induction, show that $x^n d^n y = \big(D(D - 1) \cdots (D - n + 1)\big) y(t)$.
(c) with the new (independent) variable $t$, the Euler equation (8.6.4) reduces to an equation with
constant coefficients. So, the questions in the above part can be solved by the method just
explained.
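
The left side of (8.6.5) is straightforward to evaluate once the products $\lambda(\lambda - 1) \cdots (\lambda - k + 1)$ are available. The Python sketch below is an illustration under stated assumptions: the helper names `falling_factorial` and `indicial` are introduced here, the coefficients are passed as the list $[a_0, a_1, \ldots, a_{n-1}]$, and the test equation $x^2 y'' - 2x y' + 2y = 0$ is the homogeneous part of the second item of Example 8.5.3, whose solutions $x$ and $x^2$ are quoted there. Only integer candidates for $\lambda$ are scanned, which is enough for this illustration but not a general root finder.

```python
def falling_factorial(lam, k):
    """lam * (lam - 1) * ... * (lam - k + 1); equals 1 when k = 0."""
    out = 1
    for j in range(k):
        out *= lam - j
    return out

def indicial(a, lam):
    """Left side of (8.6.5); a = [a_0, a_1, ..., a_{n-1}], leading coefficient 1."""
    n = len(a)
    return falling_factorial(lam, n) + sum(a[k] * falling_factorial(lam, k)
                                           for k in range(n))

# x^2 y'' - 2x y' + 2y = 0 corresponds to a_0 = 2, a_1 = -2.
roots = [lam for lam in range(-5, 6) if indicial([2, -2], lam) == 0]
```

Here (8.6.5) reads $\lambda(\lambda - 1) - 2\lambda + 2 = (\lambda - 1)(\lambda - 2) = 0$, so `roots` should contain exactly the exponents of the solutions $x$ and $x^2$.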


We turn our attention toward the non-homogeneous equation (8.6.1). If $y_p$ is any solution of (8.6.1)
and if $y_h$ is the general solution of the corresponding homogeneous equation (8.6.2), then
\[ y = y_h + y_p \]
is a solution of (8.6.1). The solution $y$ involves $n$ arbitrary constants. Such a solution is called the
GENERAL SOLUTION of (8.6.1).

Solving an equation of the form (8.6.1) usually means to find a general solution of (8.6.1). The
solution $y_p$ is called a PARTICULAR SOLUTION, which may not involve any arbitrary constants. Solving
(8.6.1) essentially involves two steps (as we had seen in detail in Section 8.3).

Step 1: Calculation of the homogeneous solution $y_h$, and
Step 2: Calculation of the particular solution $y_p$.

In the ensuing discussion, we describe the method of undetermined coefficients to determine $y_p$. Note
that a particular solution is not unique. In fact, if $y_p$ is a solution of (8.6.1) and $u$ is any solution of
(8.6.2), then $y_p + u$ is also a solution of (8.6.1). The undetermined coefficients method is applicable for
equations (8.6.1).


8.7 Method of Undetermined Coefficients 


In the previous section, we have seen that a general solution of
\[ L_n(y) = f(x) \quad \text{on } I \tag{8.7.6} \]
can be written in the form
\[ y = y_h + y_p, \]
where $y_h$ is a general solution of $L_n(y) = 0$ and $y_p$ is a particular solution of (8.7.6). In view of this, in
this section, we shall attempt to obtain $y_p$ for (8.7.6) using the method of undetermined coefficients in
the following particular cases of $f(x)$:
1. $f(x) = k e^{\alpha x}$; $k \neq 0$, $\alpha$ a real constant
2. $f(x) = e^{\alpha x} \big(k_1 \cos(\beta x) + k_2 \sin(\beta x)\big)$; $k_1, k_2, \alpha, \beta \in \mathbb{R}$
3. $f(x) = x^m$.

Case I. $f(x) = k e^{\alpha x}$; $k \neq 0$, $\alpha$ a real constant.
We first assume that $\alpha$ is not a root of the characteristic equation, i.e., $p(\alpha) \neq 0$. Note that
$L_n(e^{\alpha x}) = p(\alpha) e^{\alpha x}$. Therefore, let us assume that a particular solution is of the form
\[ y_p = A e^{\alpha x}, \]
where $A$, an unknown, is an undetermined coefficient. Thus
\[ L_n(y_p) = A p(\alpha) e^{\alpha x}. \]
Since $p(\alpha) \neq 0$, we can choose $A = \dfrac{k}{p(\alpha)}$ to obtain
\[ L_n(y_p) = k e^{\alpha x}. \]
Thus, $y_p = \dfrac{k}{p(\alpha)} e^{\alpha x}$ is a particular solution of $L_n(y) = k e^{\alpha x}$.

Modification Rule: If $\alpha$ is a root of the characteristic equation, i.e., $p(\alpha) = 0$, with multiplicity $r$
(i.e., $p(\alpha) = p'(\alpha) = \cdots = p^{(r-1)}(\alpha) = 0$ and $p^{(r)}(\alpha) \neq 0$), then we take $y_p$ of the form
\[ y_p = A x^r e^{\alpha x} \]
and obtain the value of $A$ by substituting $y_p$ in $L_n(y) = k e^{\alpha x}$.
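
Case I and its modification rule can be condensed into one computation. The sketch below rests on a closed form that is not stated in the text and is offered here as an assumption to be checked by substitution: differentiating $L_n(e^{\lambda x}) = p(\lambda) e^{\lambda x}$ $r$ times with respect to $\lambda$ and setting $\lambda = \alpha$ gives $L_n(x^r e^{\alpha x}) = p^{(r)}(\alpha) e^{\alpha x}$ when $\alpha$ is a root of multiplicity $r$, so that $A = k / p^{(r)}(\alpha)$ (with $r = 0$ when $p(\alpha) \neq 0$). The helper names are hypothetical, and the zero test uses a tolerance, so it is only reliable for well separated roots.

```python
def poly_eval(coeffs, lam):
    """Horner evaluation of p(lam); coeffs are listed from the leading term down."""
    value = 0
    for c in coeffs:
        value = value * lam + c
    return value

def derivative(coeffs):
    """Coefficients of p'(lam), leading term first."""
    n = len(coeffs) - 1
    return [c * (n - k) for k, c in enumerate(coeffs[:-1])]

def exponential_coefficient(coeffs, k, alpha, tol=1e-12):
    """Return (r, A) with y_p = A * x**r * exp(alpha*x) for L_n(y) = k*exp(alpha*x).

    coeffs are those of the characteristic polynomial p (leading coefficient 1).
    """
    r = 0
    while abs(poly_eval(coeffs, alpha)) < tol:   # alpha is still a root: differentiate
        coeffs = derivative(coeffs)
        r += 1
    return r, k / poly_eval(coeffs, alpha)

# y'' - 4y = 2 e^x: p = lambda^2 - 4 and alpha = 1 is not a root.
r1, A1 = exponential_coefficient([1, 0, -4], 2, 1)
# y''' - 3y'' + 3y' - y = 2 e^x: p = (lambda - 1)^3 and alpha = 1 is a triple root.
r2, A2 = exponential_coefficient([1, -3, 3, -1], 2, 1)
```

The two calls use the data of the first two items of Example 8.7.1, so the returned values can be compared with the coefficients obtained there by direct substitution.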


Example 8.7.1 1. Find a particular solution of
$$y'' - 4y = 2e^{x}.$$
Solution: Here $f(x) = 2e^{x}$ with $k = 2$ and $\alpha = 1$. Also, the characteristic polynomial is $p(\lambda) = \lambda^2 - 4$.
Note that $\alpha = 1$ is not a root of $p(\lambda) = 0$. Thus, we assume $y_p = A e^{x}$. This on substitution gives
$$A e^{x} - 4A e^{x} = 2e^{x} \implies -3A e^{x} = 2e^{x}.$$
So, we choose $A = \dfrac{-2}{3}$, which gives a particular solution as
$$y_p = \frac{-2e^{x}}{3}.$$

2. Find a particular solution of
$$y''' - 3y'' + 3y' - y = 2e^{x}.$$
Solution: The characteristic polynomial is $p(\lambda) = \lambda^3 - 3\lambda^2 + 3\lambda - 1 = (\lambda - 1)^3$ and $\alpha = 1$. Clearly,
$p(1) = 0$ and $\lambda = \alpha = 1$ has multiplicity $r = 3$. Thus, we assume $y_p = A x^3 e^{x}$. Substituting it in the
given equation, we have
$$A e^{x}(x^3 + 9x^2 + 18x + 6) - 3A e^{x}(x^3 + 6x^2 + 6x) + 3A e^{x}(x^3 + 3x^2) - A x^3 e^{x} = 2e^{x}.$$
Solving for $A$, we get $A = \dfrac{1}{3}$, and thus a particular solution is $y_p = \dfrac{x^3 e^{x}}{3}$.
3. Find a particular solution of
$$y''' - y' = e^{2x}.$$
Solution: The characteristic polynomial is $p(\lambda) = \lambda^3 - \lambda$ and $\alpha = 2$. Thus, using $y_p = A e^{2x}$, we get
$A = \dfrac{1}{p(\alpha)} = \dfrac{1}{6}$, and hence a particular solution is $y_p = \dfrac{e^{2x}}{6}$.

4. Solve $y''' - 3y'' + 3y' - y = 2e^{x}$.
Exercise 8.7.2 Find a particular solution for the following differential equations:
1. $y'' - 3y' + 2y = e^{x}$.
2. $y'' - 9y = e^{3x}$.
3. $y''' - 3y'' + 6y' - 4y = e^{2x}$.
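The choice of $A$ in Case I, including the modification rule, can be sketched in a few lines of Python. This is an illustration only, not part of the notes: the function names and the coefficient-list convention ($p(\lambda)$ stored as `[c_0, c_1, ..., c_n]`) are assumptions made here. The sketch uses the identity $L_n(x^r e^{\alpha x}) = p^{(r)}(\alpha) e^{\alpha x}$, valid when $\alpha$ is a root of multiplicity exactly $r$, so that $A = k / p^{(r)}(\alpha)$.

```python
from fractions import Fraction


def poly_eval(coeffs, x):
    """Evaluate p(x) = coeffs[0] + coeffs[1]*x + ... by Horner's rule."""
    result = Fraction(0)
    for c in reversed(coeffs):
        result = result * x + c
    return result


def poly_derivative(coeffs):
    """Coefficient list of p'(x)."""
    return [j * c for j, c in enumerate(coeffs)][1:]


def exponential_coefficient(coeffs, k, alpha):
    """Return (r, A) so that y_p = A * x**r * exp(alpha*x) solves L_n(y) = k*exp(alpha*x).

    r is the multiplicity of alpha as a root of p (r = 0 when p(alpha) != 0).
    Assumes p is not the zero polynomial and alpha is rational (exact arithmetic).
    """
    alpha = Fraction(alpha)
    r, current = 0, list(coeffs)
    # Differentiate p until the derivative no longer vanishes at alpha.
    while poly_eval(current, alpha) == 0:
        current = poly_derivative(current)
        r += 1
    return r, Fraction(k) / poly_eval(current, alpha)


# Example 8.7.1(1): p(lambda) = lambda^2 - 4, f(x) = 2 e^x.
print(exponential_coefficient([-4, 0, 1], 2, 1))
# Example 8.7.1(2): p(lambda) = (lambda - 1)^3, f(x) = 2 e^x.
print(exponential_coefficient([-1, 3, -3, 1], 2, 1))
```

The two calls reproduce $A = -2/3$ with $r = 0$ and $A = 1/3$ with $r = 3$, in agreement with the worked examples above.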


Case II. $f(x) = e^{\alpha x}\,(k_1 \cos(\beta x) + k_2 \sin(\beta x))$; $k_1, k_2, \alpha, \beta \in \mathbb{R}$
We first assume that $\alpha + i\beta$ is not a root of the characteristic equation, i.e., $p(\alpha + i\beta) \neq 0$. Here, we
assume that $y_p$ is of the form
$$y_p = e^{\alpha x}\,(A \cos(\beta x) + B \sin(\beta x)),$$
and then comparing the coefficients of $e^{\alpha x} \cos(\beta x)$ and $e^{\alpha x} \sin(\beta x)$ (why!) in $L_n(y) = f(x)$, obtain the values
of $A$ and $B$.
Modification Rule: If $\alpha + i\beta$ is a root of the characteristic equation, i.e., $p(\alpha + i\beta) = 0$, with multiplicity
$r$, then we assume a particular solution as
$$y_p = x^r e^{\alpha x}\,(A \cos(\beta x) + B \sin(\beta x)),$$
and then comparing the coefficients in $L_n(y) = f(x)$, obtain the values of $A$ and $B$.
Example 8.7.3 1. Find a particular solution of
$$y'' + 2y' + 2y = 4e^{x} \sin x.$$
Solution: Here, $\alpha = 1$ and $\beta = 1$. Thus $\alpha + i\beta = 1 + i$, which is not a root of the characteristic
equation $p(\lambda) = \lambda^2 + 2\lambda + 2 = 0$. Note that the roots of $p(\lambda) = 0$ are $-1 \pm i$.
Thus, let us assume $y_p = e^{x}(A \sin x + B \cos x)$. This gives us
$$(-4B + 4A)e^{x} \sin x + (4B + 4A)e^{x} \cos x = 4e^{x} \sin x.$$
Comparing the coefficients of $e^{x} \cos x$ and $e^{x} \sin x$ on both sides, we get $A - B = 1$ and $A + B = 0$.
On solving for $A$ and $B$, we get $A = -B = \dfrac{1}{2}$. So, a particular solution is $y_p = \dfrac{e^{x}}{2}(\sin x - \cos x)$.

2. Find a particular solution of
$$y'' + y = \sin x.$$
Solution: Here, $\alpha = 0$ and $\beta = 1$. Thus $\alpha + i\beta = i$, which is a root with multiplicity $r = 1$ of the
characteristic equation $p(\lambda) = \lambda^2 + 1 = 0$.
So, let $y_p = x(A \cos x + B \sin x)$. Substituting this in the given equation and comparing the coefficients
of $\cos x$ and $\sin x$ on both sides, we get $B = 0$ and $A = -\dfrac{1}{2}$. Thus, a particular solution is
$y_p = -\dfrac{1}{2}\, x \cos x$.
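When $\alpha + i\beta$ is not a root, the coefficients $A$ and $B$ of Case II can also be obtained with complex arithmetic instead of comparing coefficients by hand. This is a different (though equivalent) technique from the one used in the notes: since $f(x) = \mathrm{Re}\big((k_1 - i k_2) e^{(\alpha + i\beta)x}\big)$, one may take $y_p = \mathrm{Re}\big(C e^{(\alpha + i\beta)x}\big)$ with $C = (k_1 - i k_2)/p(\alpha + i\beta)$. The function name and argument order below are assumptions made for this sketch.

```python
def trig_exponential_coefficients(coeffs, k1, k2, alpha, beta):
    """Return (A, B) with y_p = exp(alpha*x) * (A*cos(beta*x) + B*sin(beta*x)).

    coeffs lists p(lambda) = coeffs[0] + coeffs[1]*lambda + ...
    Assumes p(alpha + i*beta) != 0, i.e. no modification rule is needed.
    """
    z = complex(alpha, beta)
    p_z = 0j
    for c in reversed(coeffs):  # Horner's rule at the complex point z
        p_z = p_z * z + c
    if p_z == 0:
        raise ValueError("alpha + i*beta is a root; use the modification rule")
    # f(x) = Re((k1 - i*k2) * exp(z*x)), hence y_p = Re(C * exp(z*x)).
    C = complex(k1, -k2) / p_z
    # Re((a + ib) e^{zx}) = e^{alpha x} (a cos(beta x) - b sin(beta x)).
    return C.real, -C.imag


# Example 8.7.3(1): y'' + 2y' + 2y = 4 e^x sin x.
A, B = trig_exponential_coefficients([2, 2, 1], 0, 4, 1, 1)
print(A, B)  # coefficient of e^x cos x, coefficient of e^x sin x
```

For Example 8.7.3(1) this returns the coefficient $-1/2$ for $e^{x}\cos x$ and $1/2$ for $e^{x}\sin x$, matching $y_p = \frac{e^{x}}{2}(\sin x - \cos x)$. For Example 8.7.3(2) the sketch raises an error, because $i$ is a root of $\lambda^2 + 1$ and the modification rule applies.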
Exercise 8.7.4 Find a particular solution for the following differential equations:
1. $y''' - y'' + y' - y = e^{x} \cos x$.
2. $y'''' + 2y'' + y = \sin x$.
3. $y'' - 2y' + 2y = e^{x} \cos x$.


Case III. $f(x) = x^m$
Suppose $p(0) \neq 0$. Then we assume that
$$y_p = A_m x^m + A_{m-1} x^{m-1} + \cdots + A_0$$
and then compare the coefficients of $x^k$ in $L_n(y_p) = f(x)$ to obtain the values of $A_i$ for $0 \le i \le m$.
Modification Rule: If $\lambda = 0$ is a root of the characteristic equation, i.e., $p(0) = 0$, with multiplicity $r$,
then we assume a particular solution as
$$y_p = x^r\,(A_m x^m + A_{m-1} x^{m-1} + \cdots + A_0)$$
and then compare the coefficients of $x^k$ in $L_n(y_p) = f(x)$ to obtain the values of $A_i$ for $0 \le i \le m$.
Example 8.7.5 Find a particular solution of
$$y''' - y'' + y' - y = x^2.$$
Solution: As $p(0) \neq 0$, we assume
$$y_p = A_2 x^2 + A_1 x + A_0,$$
which on substitution in the given differential equation gives
$$-2A_2 + (2A_2 x + A_1) - (A_2 x^2 + A_1 x + A_0) = x^2.$$
Comparing the coefficients of different powers of $x$ and solving, we get $A_2 = -1$, $A_1 = -2$ and $A_0 = 0$.
Thus, a particular solution is
$$y_p = -(x^2 + 2x).$$
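The coefficient comparison of Case III is a triangular linear system, so it can be solved from the highest power downwards. The following sketch assumes $p(0) \neq 0$ and stores $p(\lambda)$ as a coefficient list `[c_0, c_1, ..., c_n]`; the function name and this convention are illustrative assumptions, not notation from the notes. It uses the fact that the coefficient of $x^i$ in $L_n(y_p)$ is $\sum_j c_j A_{i+j} \frac{(i+j)!}{i!}$.

```python
from fractions import Fraction
from math import factorial


def polynomial_coefficients(coeffs, m):
    """Return [A_0, ..., A_m] with y_p = sum(A_i * x**i) solving L_n(y) = x**m.

    coeffs lists p(lambda) = coeffs[0] + coeffs[1]*lambda + ...
    Requires p(0) = coeffs[0] != 0 (otherwise the modification rule applies).
    """
    if coeffs[0] == 0:
        raise ValueError("0 is a root; use the modification rule")
    A = [Fraction(0)] * (m + 1)
    for i in range(m, -1, -1):
        # Match the coefficient of x**i on both sides of L_n(y_p) = x**m.
        rhs = Fraction(1 if i == m else 0)
        for j in range(1, len(coeffs)):
            if i + j <= m:
                rhs -= coeffs[j] * A[i + j] * Fraction(factorial(i + j), factorial(i))
        A[i] = rhs / coeffs[0]
    return A


# Example 8.7.5: p(lambda) = lambda^3 - lambda^2 + lambda - 1, f(x) = x^2.
print(polynomial_coefficients([-1, 1, -1, 1], 2))
```

The call reproduces $A_0 = 0$, $A_1 = -2$, $A_2 = -1$, i.e. $y_p = -(x^2 + 2x)$.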


Finally, note that if $y_{p_1}$ is a particular solution of $L_n(y) = f_1(x)$ and $y_{p_2}$ is a particular solution of
$L_n(y) = f_2(x)$, then a particular solution of
$$L_n(y) = k_1 f_1(x) + k_2 f_2(x)$$
is given by
$$y_p = k_1 y_{p_1} + k_2 y_{p_2}.$$
In view of this, one can use the method of undetermined coefficients for the cases where $f(x)$ is a linear
combination of the functions described above.


Example 8.7.6 Find a particular solution of
$$y'' + y = 2 \sin x + \sin 2x.$$
Solution: We can divide the problem into two problems:
1. $y'' + y = 2 \sin x$.
2. $y'' + y = \sin 2x$.
For the first problem, a particular solution (Example 8.7.3.2) is $y_{p_1} = 2 \cdot \dfrac{-1}{2}\, x \cos x = -x \cos x$.
For the second problem, one can check that $y_{p_2} = \dfrac{-1}{3} \sin(2x)$ is a particular solution.
Thus, a particular solution of the given problem is
$$y_{p_1} + y_{p_2} = -x \cos x - \frac{1}{3} \sin(2x).$$
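A candidate particular solution can always be checked by substituting it back into the equation. The sketch below does this numerically for Example 8.7.6, approximating $y''$ by a central difference; the step size `h` and the sample points are arbitrary choices made for this illustration, and the check is a sanity test rather than a proof.

```python
import math


def residual(y, x, h=1e-3):
    """Approximate y''(x) + y(x) - (2 sin x + sin 2x) using a central difference."""
    second = (y(x + h) - 2.0 * y(x) + y(x - h)) / (h * h)
    return second + y(x) - (2.0 * math.sin(x) + math.sin(2.0 * x))


def y_p(x):
    # Particular solution obtained in Example 8.7.6.
    return -x * math.cos(x) - math.sin(2.0 * x) / 3.0


# Largest residual over sample points in [-3, 3]; it should be close to zero.
worst = max(abs(residual(y_p, 0.1 * i)) for i in range(-30, 31))
print(worst)
```

A residual that is small compared with the size of the terms involved indicates that the superposition $y_{p_1} + y_{p_2}$ does satisfy the equation, up to the discretisation error of the difference quotient.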
Exercise 8.7.7 Find a particular solution for the following differential equations:
1. $y''' - y'' + y' - y = 5e^{x} \cos x + 10e^{2x}$.
2. $y'' + 2y' + y = x + e^{-x}$.
3. $y'' + 3y' - 4y = 4e^{x} + e^{4x}$.
4. $y'' + 9y = \cos x + x^2 + x^3$.
5. $y''' - 3y'' + 4y' = x^2 + e^{2x} \sin x$.
6. $y'''' + 4y''' + 6y'' + 4y' + 5y = 2 \sin x + x^2$.


Chapter 9 


Solutions Based on Power Series 


9.1 Introduction 


In the previous chapter, we had a discussion on the methods of solving
$$y'' + ay' + by = f(x),$$
where $a, b$ were real numbers and $f$ was a real valued continuous function. We also looked at Euler
Equations which can be reduced to the above form. The natural question is:
what if $a$ and $b$ are functions of $x$?

In this chapter, we have a partial answer to the above question. In general, there are no methods of
finding a solution of an equation of the form
$$y'' + q(x)y' + r(x)y = f(x), \quad x \in I,$$
where $q(x)$ and $r(x)$ are real valued continuous functions defined on an interval $I \subset \mathbb{R}$. In such a
situation, we look for a class of functions $q(x)$ and $r(x)$ for which we may be able to solve. One such
class of functions is called the set of analytic functions.


Definition 9.1.1 (Power Series) Let $x_0 \in \mathbb{R}$ and $a_0, a_1, \ldots, a_n, \ldots \in \mathbb{R}$ be fixed. An expression of the type
$$\sum_{n=0}^{\infty} a_n (x - x_0)^n \tag{9.1.1}$$
is called a power series in $x$ around $x_0$. The point $x_0$ is called the center, and the $a_n$'s are called the coefficients.

Note here that $a_n \in \mathbb{R}$ is the coefficient of $(x - x_0)^n$ and that the power series converges for $x = x_0$. So,
the set
$$S = \Big\{x \in \mathbb{R} : \sum_{n=0}^{\infty} a_n (x - x_0)^n \text{ converges}\Big\}$$
is non-empty. It turns out that the set $S$ is an interval in $\mathbb{R}$. We are thus led to the following definition.


Example 9.1.2 1. Consider the power series
$$x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots.$$
In this case, $x_0 = 0$ is the center, $a_0 = 0$ and $a_{2n} = 0$ for $n \ge 1$. Also, $a_{2n+1} = \dfrac{(-1)^n}{(2n+1)!}$, $n = 0, 1, 2, \ldots$.
Recall that the Taylor series expansion around $x_0 = 0$ of $\sin x$ is the same as the above power series.

2. Any polynomial
$$a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$$
is a power series with $x_0 = 0$ as the center, and the coefficients $a_m = 0$ for $m \ge n + 1$.


Definition 9.1.3 (Radius of Convergence) A real number $R \ge 0$ is called the radius of convergence of the
power series (9.1.1), if the expression (9.1.1) converges for all $x$ satisfying
$|x - x_0| < R$ and $R$ is the largest such number.

From what has been said earlier, it is clear that the set of points $x$ where the power series is
convergent contains the interval $(-R + x_0, x_0 + R)$, whenever $R$ is the radius of convergence. If $R = 0$, the
power series is convergent only at $x = x_0$.

Let $R > 0$ be the radius of convergence of the power series (9.1.1). Let $I = (-R + x_0, x_0 + R)$. In
the interval $I$, the power series converges. Hence, it defines a real valued function and we denote
it by $f(x)$, i.e.,
$$f(x) = \sum_{n=0}^{\infty} a_n (x - x_0)^n, \quad x \in I.$$
Such a function is well defined as long as $x \in I$. $f$ is called the function defined by the power series
(9.1.1) on $I$. Sometimes, we also use the terminology that (9.1.1) induces a function $f$ on $I$.

It is a natural question to ask how to find the radius of convergence of a power series (9.1.1). We 

state one such result below but we do not intend to give a proof. 


Theorem 9.1.4 1. Let $\sum_{n=0}^{\infty} a_n (x - x_0)^n$ be a power series with center $x_0$. Then there exists a real number
$R \ge 0$ such that
$$\sum_{n=0}^{\infty} a_n (x - x_0)^n \text{ converges for all } x \in (-R + x_0, x_0 + R).$$
In this case, the power series $\sum_{n=0}^{\infty} a_n (x - x_0)^n$ converges absolutely and uniformly on
$$|x - x_0| \le r \text{ for all } r < R$$
and diverges for all $x$ with
$$|x - x_0| > R.$$

2. Suppose $R$ is the radius of convergence of the power series (9.1.1). Suppose $\lim_{n \to \infty} \sqrt[n]{|a_n|}$ exists and
equals $\ell$.
(a) If $\ell \neq 0$, then $R = \dfrac{1}{\ell}$.
(b) If $\ell = 0$, then the power series (9.1.1) converges for all $x \in \mathbb{R}$.

Note that $\lim_{n \to \infty} \sqrt[n]{|a_n|}$ exists if $\lim_{n \to \infty} \left|\dfrac{a_{n+1}}{a_n}\right|$ exists, and in that case
$$\lim_{n \to \infty} \sqrt[n]{|a_n|} = \lim_{n \to \infty} \left|\frac{a_{n+1}}{a_n}\right|.$$


Remark 9.1.5 If the reader is familiar with the concept of limsup of a sequence, then we have a
modification of the above theorem.
In case $\sqrt[n]{|a_n|}$ does not tend to a limit as $n \to \infty$, the above theorem still holds if we replace
$\lim_{n \to \infty} \sqrt[n]{|a_n|}$ by $\limsup_{n \to \infty} \sqrt[n]{|a_n|}$.


Example 9.1.6 1. Consider the power series $\sum_{n=0}^{\infty} (x + 1)^n$. Here $x_0 = -1$ is the center and $a_n = 1$ for all
$n \ge 0$. So, $\sqrt[n]{|a_n|} = \sqrt[n]{1} = 1$. Hence, by Theorem 9.1.4, the radius of convergence is $R = 1$.

2. Consider the power series $\sum_{n \ge 0} \dfrac{(-1)^n (x + 1)^{2n+1}}{(2n+1)!}$. In this case, the center is
$x_0 = -1$, $a_n = 0$ for $n$ even and $a_{2n+1} = \dfrac{(-1)^n}{(2n+1)!}$. So,
$$\lim_{n \to \infty} \sqrt[2n+1]{|a_{2n+1}|} = 0 \quad \text{and} \quad \lim_{n \to \infty} \sqrt[2n]{|a_{2n}|} = 0.$$
Thus, $\lim_{n \to \infty} \sqrt[n]{|a_n|}$ exists and equals 0. Therefore, the power series converges for all $x \in \mathbb{R}$. Note that
the series converges to $\sin(x + 1)$.

3. Consider the power series $\sum_{n=0}^{\infty} x^{2n}$. In this case, we have
$$a_{2n} = 1 \text{ and } a_{2n+1} = 0 \text{ for } n = 0, 1, 2, \ldots.$$
So,
$$\lim_{n \to \infty} \sqrt[2n+1]{|a_{2n+1}|} = 0 \quad \text{and} \quad \lim_{n \to \infty} \sqrt[2n]{|a_{2n}|} = 1.$$
Thus, $\lim_{n \to \infty} \sqrt[n]{|a_n|}$ does not exist.
We let $u = x^2$. Then the power series $\sum_{n=0}^{\infty} x^{2n}$ reduces to $\sum_{n=0}^{\infty} u^n$. But then from Example 9.1.6.1, we
learned that $\sum_{n=0}^{\infty} u^n$ converges for all $u$ with $|u| < 1$. Therefore, the original power series converges
whenever $|x^2| < 1$ or equivalently whenever $|x| < 1$. So, the radius of convergence is $R = 1$. Note that
$$\frac{1}{1 - x^2} = \sum_{n=0}^{\infty} x^{2n} \text{ for } |x| < 1.$$

4. Consider the power series $\sum_{n \ge 0} n^n x^n$. In this case, $\sqrt[n]{|a_n|} = \sqrt[n]{n^n} = n$ does not have any finite limit as
$n \to \infty$. Hence, the power series converges only for $x = 0$.

5. The power series $\sum_{n \ge 0} \dfrac{x^n}{n!}$ has coefficients $a_n = \dfrac{1}{n!}$ and it is easily seen that $\lim_{n \to \infty} \sqrt[n]{|a_n|} = 0$ and the
power series converges for all $x \in \mathbb{R}$. Recall that it represents $e^{x}$.
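The behaviour of $\sqrt[n]{|a_n|}$ in Example 9.1.6 can be observed numerically. The sketch below evaluates the $n$-th root for a single large $n$; this only suggests the limit and does not prove it. Logarithms are used to avoid overflow, and the helper name and the choice $n = 1000$ are assumptions made for this illustration.

```python
import math


def nth_root_of_abs(log_abs_a_n, n):
    """Return |a_n|**(1/n), given log|a_n|, for n >= 1."""
    return math.exp(log_abs_a_n / n)


n = 1000
geometric = nth_root_of_abs(0.0, n)                    # a_n = 1, part 1
exponential = nth_root_of_abs(-math.lgamma(n + 1), n)  # a_n = 1/n!, part 5
growing = nth_root_of_abs(n * math.log(n), n)          # a_n = n**n, part 4

print(geometric, exponential, growing)
```

The first value is exactly 1 (so $R = 1$), the second is already small and keeps decreasing as $n$ grows (consistent with convergence for all $x$), and the third equals $n$ itself, which is unbounded (consistent with convergence only at the center).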


Definition 9.1.7 Let $f : I \to \mathbb{R}$ be a function and $x_0 \in I$. $f$ is called analytic around $x_0$ if there exists a
$\delta > 0$ such that
$$f(x) = \sum_{n \ge 0} a_n (x - x_0)^n \text{ for every } x \text{ with } |x - x_0| < \delta.$$
That is, $f$ has a power series representation in a neighbourhood of $x_0$.


9.1.1 Properties of Power Series 


Now we quickly state some of the important properties of the power series. Consider two power series
$$\sum_{n=0}^{\infty} a_n (x - x_0)^n \quad \text{and} \quad \sum_{n=0}^{\infty} b_n (x - x_0)^n$$
with radius of convergence $R_1 > 0$ and $R_2 > 0$, respectively. Let $F(x)$ and $G(x)$ be the functions defined
by the two power series for all $x \in I$, where $I = (-R + x_0, x_0 + R)$ with $R = \min\{R_1, R_2\}$. Note
that both the power series converge for all $x \in I$.

With $F(x)$, $G(x)$ and $I$ as defined above, we have the following properties of the power series.

1. EQUALITY OF POWER SERIES
The two power series defined by $F(x)$ and $G(x)$ are equal for all $x \in I$ if and only if
$$a_n = b_n \text{ for all } n = 0, 1, 2, \ldots.$$
In particular, if $\sum_{n=0}^{\infty} a_n (x - x_0)^n = 0$ for all $x \in I$, then
$$a_n = 0 \text{ for all } n = 0, 1, 2, \ldots.$$

2. TERM BY TERM ADDITION
For all $x \in I$, we have
$$F(x) + G(x) = \sum_{n=0}^{\infty} (a_n + b_n)(x - x_0)^n.$$
Essentially, it says that in the common part of the regions of convergence, the two power series
can be added term by term.

3. MULTIPLICATION OF POWER SERIES
Let us define
$$c_0 = a_0 b_0, \text{ and inductively } c_n = \sum_{j=0}^{n} a_{n-j} b_j.$$
Then for all $x \in I$, the product of $F(x)$ and $G(x)$ is defined by
$$H(x) = F(x)G(x) = \sum_{n=0}^{\infty} c_n (x - x_0)^n.$$
$H(x)$ is called the "Cauchy Product" of $F(x)$ and $G(x)$.
Note that for any $n \ge 0$, the coefficient of $(x - x_0)^n$ in
$$\Big(\sum_{j=0}^{\infty} a_j (x - x_0)^j\Big) \cdot \Big(\sum_{k=0}^{\infty} b_k (x - x_0)^k\Big) \text{ is } c_n = \sum_{j=0}^{n} a_{n-j} b_j.$$

4. TERM BY TERM DIFFERENTIATION
The term by term differentiation of the power series function $F(x)$ is
$$\sum_{n=1}^{\infty} n a_n (x - x_0)^{n-1}.$$
Note that it also has $R_1$ as the radius of convergence, since by Theorem 9.1.4, $\lim_{n \to \infty} \sqrt[n]{|a_n|} = \dfrac{1}{R_1}$ and
$$\lim_{n \to \infty} \sqrt[n]{|n a_n|} = \lim_{n \to \infty} \sqrt[n]{|n|} \cdot \lim_{n \to \infty} \sqrt[n]{|a_n|} = 1 \cdot \frac{1}{R_1}.$$
Let $0 < r < R_1$. Then for all $x \in (-r + x_0, x_0 + r)$, we have
$$\frac{d}{dx} F(x) = F'(x) = \sum_{n=1}^{\infty} n a_n (x - x_0)^{n-1}.$$
In other words, inside the region of convergence, the power series can be differentiated term by
term.
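Properties 3 and 4 act directly on the coefficient sequences, which makes them easy to experiment with on truncated series. The sketch below works with the first $N$ coefficients only; the function names and the truncation length are assumptions made for this illustration.

```python
from fractions import Fraction
from math import factorial


def cauchy_product(a, b):
    """Coefficients c_n = sum_{j=0}^{n} a[n-j]*b[j], for n < min(len(a), len(b))."""
    n_terms = min(len(a), len(b))
    return [sum(a[n - j] * b[j] for j in range(n + 1)) for n in range(n_terms)]


def differentiate(a):
    """Coefficients of the term-by-term derivative sum n*a_n (x - x0)**(n-1)."""
    return [n * a[n] for n in range(1, len(a))]


N = 8
geometric = [Fraction(1)] * N                                    # 1/(1 - x)
exp_pos = [Fraction(1, factorial(n)) for n in range(N)]          # e^x
exp_neg = [Fraction((-1) ** n, factorial(n)) for n in range(N)]  # e^(-x)

print(cauchy_product(geometric, geometric))  # coefficients of 1/(1 - x)^2
print(cauchy_product(exp_pos, exp_neg))      # coefficients of e^x * e^(-x) = 1
print(differentiate(exp_pos))                # derivative of e^x, truncated
```

The first product gives the coefficients $1, 2, 3, \ldots$, the second gives $1, 0, 0, \ldots$, and differentiating the series of $e^{x}$ returns the same coefficients (one term shorter because of the truncation).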
In the following, we shall consider power series with $x_0 = 0$ as the center. Note that by the
transformation $X = x - x_0$, the center of the power series can be shifted to the origin.


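Exercise 9.1.1 1. Which of the following represents a power series (with center $x_0$ indicated in the
brackets) in $x$?

(a) $1 + x^2 + x^4 + \cdots + x^{2n} + \cdots$ ($x_0 = 0$).
(b) $1 + \sin x + (\sin x)^2 + \cdots + (\sin x)^n + \cdots$ ($x_0 = 0$).
(c) $1 + x|x| + x^2|x^2| + \cdots + x^n|x^n| + \cdots$ ($x_0 = 0$).

2. Let $f(x)$ and $g(x)$ be two power series around $x_0 = 0$, defined by
$$f(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots + (-1)^n \frac{x^{2n+1}}{(2n+1)!} + \cdots$$
$$\text{and } g(x) = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots + (-1)^n \frac{x^{2n}}{(2n)!} + \cdots.$$
Find the radius of convergence of $f(x)$ and $g(x)$. Also, for each $x$ in the domain of convergence, show
that
$$f'(x) = g(x) \text{ and } g'(x) = -f(x).$$
[Hint: Use Properties 1, 2, 3 and 4 mentioned above. Also, note that we usually call $f(x)$ by $\sin x$
and $g(x)$ by $\cos x$.]

3. Find the radius of convergence of the following series centered at $x_0 = -1$.

(b) $1 + (x + 1) + 2(x + 1)^2 + \cdots + n(x + 1)^n + \cdots$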


9.2 Solutions in terms of Power Series 
Consider a linear second order equation of the type
$$y'' + a(x)y' + b(x)y = 0. \tag{9.2.1}$$
Let $a$ and $b$ be analytic around the point $x_0 = 0$. In such a case, we may hope to have a solution $y$ in
terms of a power series, say
$$y = \sum_{k=0}^{\infty} c_k x^k. \tag{9.2.2}$$
In the absence of any information, let us assume that (9.2.1) has a solution $y$ represented by (9.2.2). We
substitute (9.2.2) in Equation (9.2.1) and try to find the values of the $c_k$'s. Let us take up an example for
illustration.


Example 9.2.1 Consider the differential equation
$$y'' + y = 0. \tag{9.2.3}$$
Here $a(x) = 0$, $b(x) = 1$, which are analytic around $x_0 = 0$.
Solution: Let
$$y = \sum_{n=0}^{\infty} c_n x^n. \tag{9.2.4}$$
Then $y' = \sum_{n=0}^{\infty} n c_n x^{n-1}$ and $y'' = \sum_{n=0}^{\infty} n(n-1) c_n x^{n-2}$. Substituting the expressions for $y$, $y'$ and $y''$ in
Equation (9.2.3), we get
$$\sum_{n=0}^{\infty} n(n-1) c_n x^{n-2} + \sum_{n=0}^{\infty} c_n x^n = 0$$
or, equivalently
$$0 = \sum_{n=0}^{\infty} (n+2)(n+1) c_{n+2} x^n + \sum_{n=0}^{\infty} c_n x^n = \sum_{n=0}^{\infty} \{(n+1)(n+2) c_{n+2} + c_n\} x^n.$$
Hence for all $n = 0, 1, 2, \ldots$,
$$(n+1)(n+2) c_{n+2} + c_n = 0 \quad \text{or} \quad c_{n+2} = -\frac{c_n}{(n+1)(n+2)}.$$
Therefore, we have
$$c_2 = -\frac{c_0}{2!}, \qquad c_3 = -\frac{c_1}{3!},$$
$$c_4 = (-1)^2 \frac{c_0}{4!}, \qquad c_5 = (-1)^2 \frac{c_1}{5!},$$
$$\vdots$$
$$c_{2n} = (-1)^n \frac{c_0}{(2n)!}, \qquad c_{2n+1} = (-1)^n \frac{c_1}{(2n+1)!}.$$
Here, $c_0$ and $c_1$ are arbitrary. So,
$$y = c_0 \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n}}{(2n)!} + c_1 \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{(2n+1)!}$$
or $y = c_0 \cos(x) + c_1 \sin(x)$, where $c_0$ and $c_1$ can be chosen arbitrarily. For $c_0 = 1$ and $c_1 = 0$, we get
$y = \cos(x)$. That is, $\cos(x)$ is a solution of the Equation (9.2.3). Similarly, $y = \sin(x)$ is also a solution of
Equation (9.2.3).
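The recurrence $c_{n+2} = -c_n / ((n+1)(n+2))$ from Example 9.2.1 can be iterated directly, and the resulting partial sums compared with $\cos x$ and $\sin x$. The function names, the number of terms, and the evaluation point below are arbitrary choices for this sketch.

```python
import math


def series_coefficients(c0, c1, n_terms):
    """Coefficients c_n of y = sum c_n x**n solving y'' + y = 0."""
    c = [float(c0), float(c1)]
    for n in range(n_terms - 2):
        c.append(-c[n] / ((n + 1) * (n + 2)))  # c_{n+2} = -c_n / ((n+1)(n+2))
    return c[:n_terms]


def partial_sum(c, x):
    """Evaluate the truncated power series sum c_n x**n."""
    return sum(cn * x ** n for n, cn in enumerate(c))


x = 1.2
cos_like = partial_sum(series_coefficients(1, 0, 30), x)  # c0 = 1, c1 = 0
sin_like = partial_sum(series_coefficients(0, 1, 30), x)  # c0 = 0, c1 = 1
print(cos_like - math.cos(x), sin_like - math.sin(x))
```

With 30 terms the two differences are at the level of floating-point rounding, which illustrates that the choices $(c_0, c_1) = (1, 0)$ and $(0, 1)$ produce $\cos x$ and $\sin x$ respectively.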


Exercise 9.2.2 Assuming that the solutions $y$ of the following differential equations admit power series
representation, find $y$ in terms of a power series.

1. $y' = -y$, (center at $x_0 = 0$).

2. $y' = 1 + y^2$, (center at $x_0 = 0$).

3. Find two linearly independent solutions of

(a) $y'' - y = 0$, (center at $x_0 = 0$).
(b) $y'' + 4y = 0$, (center at $x_0 = 0$).


9.3 Statement of Frobenius Theorem for Regular (Ordinary) 
Point 


Earlier, we saw a few properties of power series and some of their uses. We now ask whether an equation of the form
$$y'' + a(x)y' + b(x)y = f(x), \quad x \in I \tag{9.3.1}$$
admits a solution $y$ which has a power series representation around a point $x_0 \in I$. In other words, we are
interested in the existence of a power series solution of (9.3.1) under certain conditions on
$a(x)$, $b(x)$ and $f(x)$. The following is one such result. We omit its proof.
Theorem 9.3.1 Let $a(x)$, $b(x)$ and $f(x)$ admit a power series representation around a point $x = x_0 \in I$,
with non-zero radius of convergence $r_1$, $r_2$ and $r_3$, respectively. Let $R = \min\{r_1, r_2, r_3\}$. Then the Equation
(9.3.1) has a solution $y$ which has a power series representation around $x_0$ with radius of convergence $R$.

Remark 9.3.2 We remind the readers that Theorem 9.3.1 is true for Equations (9.3.1), whenever the
coefficient of $y''$ is 1.

Secondly, a point $x_0$ is called an ORDINARY POINT for (9.3.1) if $a(x)$, $b(x)$ and $f(x)$ admit power
series expansions (with non-zero radius of convergence) around $x = x_0$. $x_0$ is called a SINGULAR POINT
for (9.3.1) if $x_0$ is not an ordinary point for (9.3.1).

The following are some examples for illustration of the utility of Theorem 9.3.1.


Exercise 9.3.3 1. Examine whether the given point $x_0$ is an ordinary point or a singular point for the
following differential equations.

(a) $(x - 1)y'' + \sin x\, y = 0$, $x_0 = 0$.

(b) y” + S28y = 0, zo = 0. 

(c) Find two linearly independent solutions of

(d) $(1 - x^2)y'' - 2xy' + n(n + 1)y = 0$, $x_0 = 0$, $n$ is a real constant.

2. Show that the following equations admit power series solutions around a given $x_0$. Also, find the power
series solutions if they exist.
(a) $y'' + y = 0$, $x_0 = 0$.
(b) $xy'' + y = 0$, $x_0 = 0$.
(c) $y'' + 9y = 0$, $x_0 = 0$.


9.4 Legendre Equations and Legendre Polynomials 


9.4.1 Introduction 
The Legendre Equation plays a vital role in many problems of mathematical physics and in the theory of
quadratures (as applied to numerical integration).

Definition 9.4.1 The equation
$$(1 - x^2)y'' - 2xy' + p(p + 1)y = 0, \quad -1 < x < 1 \tag{9.4.1}$$
where $p \in \mathbb{R}$, is called a LEGENDRE EQUATION of order $p$.


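Equation (9.4.1) was studied by Legendre and hence the name Legendre Equation.
Equation (9.4.1) may be rewritten as
$$y'' - \frac{2x}{1 - x^2}\, y' + \frac{p(p + 1)}{1 - x^2}\, y = 0.$$
The functions $\dfrac{2x}{1 - x^2}$ and $\dfrac{p(p + 1)}{1 - x^2}$ are analytic around $x_0 = 0$ (since they have power series expressions
with center at $x_0 = 0$ and with $R = 1$ as the radius of convergence). By Theorem 9.3.1, a solution $y$ of
(9.4.1) admits a power series representation (with center at $x_0 = 0$) with radius of convergence $R = 1$. Let us
assume that $y = \sum_{k=0}^{\infty} a_k x^k$ is a solution of (9.4.1). We have to find the values of the $a_k$'s. Substituting the
expressions
$$y' = \sum_{k=0}^{\infty} k a_k x^{k-1} \quad \text{and} \quad y'' = \sum_{k=0}^{\infty} k(k - 1) a_k x^{k-2}$$
in Equation (9.4.1), we get
$$\sum_{k=0}^{\infty} \{(k + 1)(k + 2) a_{k+2} + a_k (p - k)(p + k + 1)\} x^k = 0.$$
Hence, for $k = 0, 1, 2, \ldots$
$$a_{k+2} = -\frac{(p - k)(p + k + 1)}{(k + 1)(k + 2)}\, a_k.$$
It now follows that
$$a_2 = -\frac{p(p + 1)}{2!}\, a_0, \qquad a_3 = -\frac{(p - 1)(p + 2)}{3!}\, a_1,$$
$$a_4 = -\frac{(p - 2)(p + 3)}{3 \cdot 4}\, a_2 = (-1)^2 \frac{p(p - 2)(p + 1)(p + 3)}{4!}\, a_0, \qquad a_5 = (-1)^2 \frac{(p - 1)(p - 3)(p + 2)(p + 4)}{5!}\, a_1,$$
etc. In general,
$$a_{2m} = (-1)^m \frac{p(p - 2) \cdots (p - 2m + 2)(p + 1)(p + 3) \cdots (p + 2m - 1)}{(2m)!}\, a_0$$
and
$$a_{2m+1} = (-1)^m \frac{(p - 1)(p - 3) \cdots (p - 2m + 1)(p + 2)(p + 4) \cdots (p + 2m)}{(2m + 1)!}\, a_1.$$
It turns out that both $a_0$ and $a_1$ are arbitrary. So, by choosing $a_0 = 1, a_1 = 0$ and $a_0 = 0, a_1 = 1$ in the
above expressions, we have the following two solutions of the Legendre Equation (9.4.1), namely,
$$y_1 = 1 - \frac{p(p + 1)}{2!} x^2 + \cdots + (-1)^m \frac{(p - 2m + 2) \cdots p\,(p + 1) \cdots (p + 2m - 1)}{(2m)!} x^{2m} + \cdots \tag{9.4.2}$$
and
$$y_2 = x - \frac{(p - 1)(p + 2)}{3!} x^3 + \cdots + (-1)^m \frac{(p - 2m + 1) \cdots (p - 1)(p + 2) \cdots (p + 2m)}{(2m + 1)!} x^{2m+1} + \cdots. \tag{9.4.3}$$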

Remark 9.4.2 $y_1$ and $y_2$ are two linearly independent solutions of the Legendre Equation (9.4.1). It
now follows that the general solution of (9.4.1) is
$$y = c_1 y_1 + c_2 y_2 \tag{9.4.4}$$
where $c_1$ and $c_2$ are arbitrary real numbers.


9.4.2 Legendre Polynomials 


In many problems, the real number $p$, appearing in the Legendre Equation (9.4.1), is a non-negative
integer. Suppose $p = n$ is a non-negative integer. Recall
$$a_{k+2} = -\frac{(n - k)(n + k + 1)}{(k + 1)(k + 2)}\, a_k, \quad k = 0, 1, 2, \ldots. \tag{9.4.5}$$
Therefore, when $k = n$, we get
$$a_{n+2} = a_{n+4} = \cdots = a_{n+2m} = \cdots = 0 \text{ for all positive integers } m.$$

Case 1: Let $n$ be a positive even integer. Then $y_1$ in Equation (9.4.2) is a polynomial of degree $n$. In fact,
$y_1$ is an even polynomial in the sense that the terms of $y_1$ are even powers of $x$ and hence $y_1(-x) = y_1(x)$.

Case 2: Now, let $n$ be a positive odd integer. Then $y_2(x)$ in Equation (9.4.3) is a polynomial of degree
$n$. In this case, $y_2$ is an odd polynomial in the sense that the terms of $y_2$ are odd powers of $x$ and hence
$y_2(-x) = -y_2(x)$.

In either case, we have a polynomial solution for Equation (9.4.1).
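The termination of the series for integer $p = n$ can be seen by iterating the recurrence (9.4.5) with exact rational arithmetic. In the sketch below the function name is hypothetical and $p$ is assumed to be an integer (or a `Fraction`), so that every coefficient is computed exactly.

```python
from fractions import Fraction


def legendre_series(p, a0, a1, n_terms):
    """Coefficients a_k of y = sum a_k x**k for the Legendre equation of order p.

    Uses a_{k+2} = -(p - k)(p + k + 1) / ((k + 1)(k + 2)) * a_k.
    """
    a = [Fraction(a0), Fraction(a1)]
    for k in range(n_terms - 2):
        a.append(-Fraction((p - k) * (p + k + 1), (k + 1) * (k + 2)) * a[k])
    return a[:n_terms]


print(legendre_series(2, 1, 0, 8))  # y_1 for p = 2: terminates after x^2
print(legendre_series(3, 0, 1, 8))  # y_2 for p = 3: terminates after x^3
```

For $p = 2$ the coefficients are $1, 0, -3, 0, 0, \ldots$, i.e. $y_1 = 1 - 3x^2$, and for $p = 3$ they give $y_2 = x - \frac{5}{3}x^3$; both are polynomials of degree $n$, as stated in Cases 1 and 2.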


Definition 9.4.3 A polynomial solution $P_n(x)$ of (9.4.1) is called a LEGENDRE POLYNOMIAL whenever
$P_n(1) = 1$.

Fix a positive integer $n$ and consider $P_n(x) = a_0 + a_1 x + \cdots + a_n x^n$. Then it can be checked that
$P_n(1) = 1$ if we choose
$$a_n = \frac{(2n)!}{2^n (n!)^2} = \frac{1 \cdot 3 \cdot 5 \cdots (2n - 1)}{n!}.$$
Using the recurrence relation, we have
$$a_{n-2} = -\frac{(n - 1)n}{2(2n - 1)}\, a_n = -\frac{(2n - 2)!}{2^n (n - 1)!(n - 2)!}$$
by the choice of $a_n$. In general, if $n - 2m \ge 0$, then
$$a_{n-2m} = (-1)^m \frac{(2n - 2m)!}{2^n\, m!\,(n - m)!\,(n - 2m)!}.$$
Hence,
$$P_n(x) = \sum_{m=0}^{M} (-1)^m \frac{(2n - 2m)!}{2^n\, m!\,(n - m)!\,(n - 2m)!}\, x^{n-2m}, \tag{9.4.6}$$
where $M = \dfrac{n}{2}$ when $n$ is even and $M = \dfrac{n - 1}{2}$ when $n$ is odd.
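The closed-form sum for $P_n(x)$ is straightforward to evaluate with exact fractions. The sketch below returns the coefficients of $P_n$ in increasing powers of $x$; the function name and the list convention are assumptions made for this illustration. Summing the coefficients evaluates $P_n(1)$, which gives a quick check of the normalisation.

```python
from fractions import Fraction
from math import factorial


def legendre_coefficients(n):
    """Return [a_0, ..., a_n] for P_n(x) = sum a_k x**k, using the closed-form sum."""
    a = [Fraction(0)] * (n + 1)
    for m in range(n // 2 + 1):  # M = n/2 (n even) or (n-1)/2 (n odd)
        a[n - 2 * m] = Fraction(
            (-1) ** m * factorial(2 * n - 2 * m),
            2 ** n * factorial(m) * factorial(n - m) * factorial(n - 2 * m),
        )
    return a


print(legendre_coefficients(2))  # P_2(x) = (3x^2 - 1)/2
print(legendre_coefficients(3))  # P_3(x) = (5x^3 - 3x)/2
print([sum(legendre_coefficients(n)) for n in range(8)])  # values P_n(1)
```

The last line prints a list of ones, confirming $P_n(1) = 1$ for the first few $n$.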


Proposition 9.4.4 Let $p = n$ be a non-negative even integer. Then any polynomial solution $y$ of (9.4.1)
which has only even powers of $x$ is a multiple of $P_n(x)$.

Similarly, if $p = n$ is a non-negative odd integer, then any polynomial solution $y$ of (9.4.1) which has only
odd powers of $x$ is a multiple of $P_n(x)$.

PROOF. Suppose that $n$ is a non-negative even integer. Let $y$ be a polynomial solution of (9.4.1). By
(9.4.4)
$$y = c_1 y_1 + c_2 y_2,$$
where $y_1$ is a polynomial of degree $n$ (with even powers of $x$) and $y_2$ is a power series solution with odd
powers only. Since $y$ is a polynomial, we have $c_2 = 0$, or $y = c_1 y_1$ with $c_1 \neq 0$.
Similarly, $P_n(x) = c_1' y_1$ with $c_1' \neq 0$, which implies that $y$ is a multiple of $P_n(x)$. A similar proof holds
when $n$ is an odd positive integer.


We have an alternate way of evaluating $P_n(x)$. It is used later for the orthogonality properties
of the Legendre polynomials $P_n(x)$.

Theorem 9.4.5 (Rodrigues Formula) The Legendre polynomials $P_n(x)$ for $n = 1, 2, \ldots$, are given by
$$P_n(x) = \frac{1}{2^n n!} \frac{d^n}{dx^n} (x^2 - 1)^n. \tag{9.4.7}$$
PROOF. Let $V(x) = (x^2 - 1)^n$. Then $\dfrac{d}{dx} V(x) = 2nx(x^2 - 1)^{n-1}$ or
$$(x^2 - 1) \frac{d}{dx} V(x) = 2nx(x^2 - 1)^n = 2nx V(x).$$
Now differentiating $(n + 1)$ times (by the use of the Leibniz rule for differentiation), we get
$$(x^2 - 1) \frac{d^{n+2}}{dx^{n+2}} V(x) + 2(n + 1)x \frac{d^{n+1}}{dx^{n+1}} V(x) + \frac{2n(n + 1)}{1 \cdot 2} \frac{d^n}{dx^n} V(x) - 2nx \frac{d^{n+1}}{dx^{n+1}} V(x) - 2n(n + 1) \frac{d^n}{dx^n} V(x) = 0.$$
By denoting $U(x) = \dfrac{d^n}{dx^n} V(x)$, we have
$$(x^2 - 1)U'' + U'\{2(n + 1)x - 2nx\} + U\{n(n + 1) - 2n(n + 1)\} = 0$$
$$\text{or} \quad (1 - x^2)U'' - 2xU' + n(n + 1)U = 0.$$
This tells us that $U(x)$ is a solution of the Legendre Equation (9.4.1). So, by Proposition 9.4.4, we have
$$P_n(x) = \alpha U(x) = \alpha \frac{d^n}{dx^n} (x^2 - 1)^n \text{ for some } \alpha \in \mathbb{R}.$$
Also, let us note that
$$\frac{d^n}{dx^n} (x^2 - 1)^n = \frac{d^n}{dx^n} \{(x - 1)(x + 1)\}^n = n!\,(x + 1)^n + \text{terms containing a factor of } (x - 1).$$
Therefore,
$$\frac{d^n}{dx^n} (x^2 - 1)^n \Big|_{x=1} = 2^n n! \quad \text{or, equivalently} \quad \frac{1}{2^n n!} \frac{d^n}{dx^n} (x^2 - 1)^n \Big|_{x=1} = 1$$
and thus
$$P_n(x) = \frac{1}{2^n n!} \frac{d^n}{dx^n} (x^2 - 1)^n.$$

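Example 9.4.6 1. When $n = 0$, $P_0(x) = 1$.

2. When $n = 1$, $P_1(x) = \dfrac{1}{2} \dfrac{d}{dx} (x^2 - 1) = x$.

3. When $n = 2$, $P_2(x) = \dfrac{1}{2^2\, 2!} \dfrac{d^2}{dx^2} (x^2 - 1)^2 = \dfrac{1}{8}(12x^2 - 4) = \dfrac{3}{2} x^2 - \dfrac{1}{2}$.

One may observe that the Rodrigues formula is very useful in the computation of $P_n(x)$ for "small" values
of $n$.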
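The Rodrigues formula can be carried out mechanically on coefficient lists: expand $(x^2 - 1)^n$ by the binomial theorem, differentiate $n$ times, and divide by $2^n n!$. The sketch below does exactly this with exact fractions; the function name is an assumption made for this illustration.

```python
from fractions import Fraction
from math import comb, factorial


def rodrigues(n):
    """Coefficients [a_0, ..., a_n] of P_n from (1/(2^n n!)) d^n/dx^n (x^2 - 1)^n."""
    # Binomial expansion: (x^2 - 1)^n = sum_j C(n, j) (-1)^(n-j) x^(2j).
    poly = [Fraction(0)] * (2 * n + 1)
    for j in range(n + 1):
        poly[2 * j] = Fraction(comb(n, j) * (-1) ** (n - j))
    for _ in range(n):  # differentiate n times
        poly = [k * poly[k] for k in range(1, len(poly))]
    return [c / (2 ** n * factorial(n)) for c in poly]


for n in range(3):
    print(n, rodrigues(n))
```

The output lists the coefficients of $1$, $x$ and $\frac{3}{2}x^2 - \frac{1}{2}$, in agreement with Example 9.4.6.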


Theorem 9.4.7 Let $P_n(x)$ denote, as usual, the Legendre Polynomial of degree $n$. Then
$$\int_{-1}^{1} P_n(x) P_m(x)\, dx = 0 \text{ if } m \neq n. \tag{9.4.8}$$

PROOF. We know that the polynomials $P_n(x)$ and $P_m(x)$ satisfy
$$\big((1 - x^2) P_n'(x)\big)' + n(n + 1) P_n(x) = 0 \quad \text{and} \tag{9.4.9}$$
$$\big((1 - x^2) P_m'(x)\big)' + m(m + 1) P_m(x) = 0. \tag{9.4.10}$$
Multiplying Equation (9.4.9) by $P_m(x)$ and Equation (9.4.10) by $P_n(x)$ and subtracting, we get
$$\big(n(n + 1) - m(m + 1)\big) P_n(x) P_m(x) = \big((1 - x^2) P_m'(x)\big)' P_n(x) - \big((1 - x^2) P_n'(x)\big)' P_m(x).$$
Therefore,
$$\big(n(n + 1) - m(m + 1)\big) \int_{-1}^{1} P_n(x) P_m(x)\, dx$$
$$= \int_{-1}^{1} \Big( \big((1 - x^2) P_m'(x)\big)' P_n(x) - \big((1 - x^2) P_n'(x)\big)' P_m(x) \Big)\, dx$$
$$= -\int_{-1}^{1} (1 - x^2) P_m'(x) P_n'(x)\, dx + (1 - x^2) P_m'(x) P_n(x) \Big|_{x=-1}^{x=1}$$
$$\quad + \int_{-1}^{1} (1 - x^2) P_n'(x) P_m'(x)\, dx - (1 - x^2) P_n'(x) P_m(x) \Big|_{x=-1}^{x=1}$$
$$= 0.$$
Since $n \neq m$, $n(n + 1) \neq m(m + 1)$ and therefore, we have
$$\int_{-1}^{1} P_n(x) P_m(x)\, dx = 0 \text{ if } m \neq n.$$
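Theorem 9.4.8 For $n = 0, 1, 2, \ldots$
$$\int_{-1}^{1} P_n^2(x)\, dx = \frac{2}{2n + 1}. \tag{9.4.11}$$
PROOF. Let us write $V(x) = (x^2 - 1)^n$. By the Rodrigues formula, we have
$$\int_{-1}^{1} P_n^2(x)\, dx = \int_{-1}^{1} \Big(\frac{1}{n!\, 2^n}\Big)^2 \frac{d^n}{dx^n} V(x)\, \frac{d^n}{dx^n} V(x)\, dx.$$
Let us call $I = \displaystyle\int_{-1}^{1} \frac{d^n}{dx^n} V(x)\, \frac{d^n}{dx^n} V(x)\, dx$. Note that for $0 \le m < n$,
$$\frac{d^m}{dx^m} V(-1) = \frac{d^m}{dx^m} V(1) = 0. \tag{9.4.12}$$
Therefore, integrating $I$ by parts and using (9.4.12) at each step, we get
$$I = \int_{-1}^{1} \frac{d^{2n}}{dx^{2n}} V(x) \cdot (-1)^n V(x)\, dx = (2n)! \int_{-1}^{1} (1 - x^2)^n\, dx = (2n)!\; 2 \int_{0}^{1} (1 - x^2)^n\, dx.$$
Now substitute $x = \cos\theta$ and use the value of the integral $\displaystyle\int_{0}^{\pi/2} \sin^{2n+1}\theta\, d\theta$, to get the required result.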


We now state an important expansion theorem. The proof is beyond the scope of this book. 


Theorem 9.4.9 Let $f(x)$ be a real valued continuous function defined in $[-1, 1]$. Then
$$f(x) = \sum_{n=0}^{\infty} a_n P_n(x), \quad x \in [-1, 1]$$
where $a_n = \dfrac{2n + 1}{2} \displaystyle\int_{-1}^{1} f(x) P_n(x)\, dx$.

Legendre polynomials can also be generated by a suitable function. To do that, we state the following 
result without proof. 
Theorem 9.4.10 Let $P_n(x)$ be the Legendre polynomial of degree $n$. Then
$$\frac{1}{\sqrt{1 - 2xt + t^2}} = \sum_{n=0}^{\infty} P_n(x)\, t^n, \quad t \neq 1. \tag{9.4.13}$$
The function $h(t) = \dfrac{1}{\sqrt{1 - 2xt + t^2}}$ admits a power series expansion in $t$ (for small $t$) and the
coefficient of $t^n$ is $P_n(x)$. The function $h(t)$ is called the GENERATING FUNCTION for the Legendre
polynomials.


Exercise 9.4.11 1. By using the Rodrigues formula, find $P_0(x)$, $P_1(x)$ and $P_2(x)$.
2. Use the generating function (9.4.13)

(a) to find $P_0(x)$, $P_1(x)$ and $P_2(x)$.

(b) to show that $P_n(x)$ is an odd function whenever $n$ is odd and is an even function whenever $n$ is
even.


Using the generating function (9.4.13), we can establish the following relations:
$$(n + 1) P_{n+1}(x) = (2n + 1)\, x\, P_n(x) - n\, P_{n-1}(x) \tag{9.4.14}$$
$$n P_n(x) = x P_n'(x) - P_{n-1}'(x) \tag{9.4.15}$$
$$P_{n+1}'(x) = x P_n'(x) + (n + 1) P_n(x). \tag{9.4.16}$$
The relations (9.4.14), (9.4.15) and (9.4.16) are called recurrence relations for the Legendre polynomials
$P_n(x)$. The relation (9.4.14) is also known as Bonnet's recurrence relation. We will now give the
proof of (9.4.14) using (9.4.13). The readers are required to prove the other two recurrence relations.

Differentiating the generating function (9.4.13) with respect to $t$ (keeping the variable $x$ fixed), we
get
$$-\frac{1}{2} (1 - 2xt + t^2)^{-\frac{3}{2}} (-2x + 2t) = \sum_{n=0}^{\infty} n P_n(x)\, t^{n-1}.$$
Or equivalently,
$$(x - t)(1 - 2xt + t^2)^{-\frac{1}{2}} = (1 - 2xt + t^2) \sum_{n=0}^{\infty} n P_n(x)\, t^{n-1}.$$
We now substitute $\sum_{n=0}^{\infty} P_n(x)\, t^n$ in the left hand side for $(1 - 2xt + t^2)^{-\frac{1}{2}}$, to get
$$(x - t) \sum_{n=0}^{\infty} P_n(x)\, t^n = (1 - 2xt + t^2) \sum_{n=0}^{\infty} n P_n(x)\, t^{n-1}.$$
The two sides are power series in $t$ and therefore, comparing the coefficients of $t^n$, we get
$$x P_n(x) - P_{n-1}(x) = (n + 1) P_{n+1}(x) + (n - 1) P_{n-1}(x) - 2n\, x\, P_n(x).$$
This is clearly the same as (9.4.14).

To prove (9.4.15), one needs to differentiate the generating function with respect to $x$ (keeping $t$
fixed) and do a similar simplification. Now, use the relations (9.4.14) and (9.4.15) to get the relation
(9.4.16). These relations will be helpful in solving the problems given below.
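Bonnet's recurrence (9.4.14) gives a convenient way to generate the Legendre polynomials, and with exact rational arithmetic the orthogonality relation (9.4.8) and the norm (9.4.11) can be checked for small $n$. The sketch below is an illustration only; the function names and the coefficient-list representation (increasing powers of $x$) are assumptions made here.

```python
from fractions import Fraction


def legendre_by_bonnet(n_max):
    """Coefficient lists of P_0, ..., P_{n_max} from (n+1)P_{n+1} = (2n+1)xP_n - nP_{n-1}."""
    polys = [[Fraction(1)], [Fraction(0), Fraction(1)]]
    for n in range(1, n_max):
        x_pn = [Fraction(0)] + polys[n]          # coefficients of x * P_n
        prev = polys[n - 1] + [Fraction(0)] * 2  # P_{n-1}, padded to the same length
        polys.append([((2 * n + 1) * u - n * v) / (n + 1) for u, v in zip(x_pn, prev)])
    return polys[: n_max + 1]


def inner_product(p, q):
    """Exact value of the integral of p(x)*q(x) over [-1, 1]."""
    total = Fraction(0)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            if (i + j) % 2 == 0:  # odd powers of x integrate to 0 over [-1, 1]
                total += pi * qj * Fraction(2, i + j + 1)
    return total


P = legendre_by_bonnet(5)
print(P[2], P[3])
print(inner_product(P[2], P[3]), inner_product(P[2], P[4]))  # orthogonality
print([inner_product(P[n], P[n]) for n in range(6)])         # norms 2/(2n+1)
```

The inner products of distinct polynomials vanish, and the squared norms come out as $2, \frac{2}{3}, \frac{2}{5}, \ldots$, matching Theorems 9.4.7 and 9.4.8 for these values of $n$.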


Exercise 9.4.12 1. Find a polynomial solution $y(x)$ of $(1 - x^2)y'' - 2xy' + 20y = 0$ such that $y(1) = 10$.

2. Prove the following:

(a) $\displaystyle\int_{-1}^{1} P_m(x)\, dx = 0$ for all positive integers $m \ge 1$.

(b) $\displaystyle\int_{-1}^{1} x^{2n+1} P_{2m}(x)\, dx = 0$ whenever $m$ and $n$ are positive integers with $m \neq n$.

(c) $\displaystyle\int_{-1}^{1} x^m P_n(x)\, dx = 0$ whenever $m$ and $n$ are positive integers with $m < n$.

3. Show that $P_n'(1) = \dfrac{n(n + 1)}{2}$ and $P_n'(-1) = (-1)^{n-1} \dfrac{n(n + 1)}{2}$.

4. Establish the following recurrence relations.

(a) $(n + 1) P_n(x) = P_{n+1}'(x) - x P_n'(x)$.
(b) $(1 - x^2) P_n'(x) = n[P_{n-1}(x) - x P_n(x)]$.
Part III


Laplace Transform 


Chapter 10 


Laplace Transform 


10.1 Introduction 


In many problems, a function $f(t)$, $t \in [a, b]$ is transformed to another function $F(s)$ through a relation
of the type:
$$F(s) = \int_{a}^{b} K(t, s) f(t)\, dt$$
where $K(t, s)$ is a known function. Here, $F(s)$ is called the integral transform of $f(t)$. Thus, an integral
transform sends a given function $f(t)$ into another function $F(s)$. This transformation of $f(t)$ into $F(s)$
provides a method to tackle a problem more readily. In some cases, it affords solutions to otherwise
difficult problems. In view of this, the integral transforms find numerous applications in engineering
problems. The Laplace transform is a particular case of an integral transform (where $f(t)$ is defined on $[0, \infty)$
and $K(s, t) = e^{-st}$). As we will see in the following, application of the Laplace transform reduces a linear
differential equation with constant coefficients to an algebraic equation, which can be solved by algebraic
methods. Thus, it provides a powerful tool to solve differential equations.

It is important to note here that there is some sort of analogy with what we had learnt during the
study of logarithms in school. That is, to multiply two numbers, we first calculate their logarithms, add
them and then use the table of antilogarithms to get back the original product. In a similar way, we first
transform the problem that was posed as a function of $f(t)$ to a problem in $F(s)$, make some calculations
and then use the table of inverse Laplace transforms to get the solution of the actual problem.

In this chapter, we shall see some properties of the Laplace transform and its applications in solving
differential equations.


10.2 Definitions and Examples 


Definition 10.2.1 (Piece-wise Continuous Function) 1. A function $f(t)$ is said to be a piece-wise
continuous function on a closed interval $[a, b] \subset \mathbb{R}$, if there exists a finite number of points $a = t_0 < t_1 <
t_2 < \cdots < t_N = b$ such that $f(t)$ is continuous in each of the intervals $(t_{i-1}, t_i)$ for $1 \le i \le N$ and
has finite limits as $t$ approaches the end points, see the Figure 10.1.

2. A function $f(t)$ is said to be a piece-wise continuous function for $t \ge 0$, if $f(t)$ is a piece-wise continuous
function on every closed interval $[a, b] \subset [0, \infty)$. For example, see Figure


Figure 10.1: Piecewise Continuous Function

Definition 10.2.2 (Laplace Transform) Let $f : [0, \infty) \to \mathbb{R}$ and $s \in \mathbb{R}$. Then $F(s)$, for $s \in \mathbb{R}$, is called
the LAPLACE TRANSFORM of $f(t)$, and is defined by
$$\mathcal{L}(f(t)) = F(s) = \int_{0}^{\infty} f(t) e^{-st}\, dt$$
whenever the integral exists.

(Recall that $\int_{0}^{\infty} g(t)\, dt$ exists if $\lim_{b \to \infty} \int_{0}^{b} g(t)\, dt$ exists and we define $\int_{0}^{\infty} g(t)\, dt = \lim_{b \to \infty} \int_{0}^{b} g(t)\, dt$.)


Remark 10.2.3 1. Let $f(t)$ be an EXPONENTIALLY BOUNDED function, i.e.,
$$|f(t)| \le M e^{\alpha t} \text{ for all } t > 0 \text{ and for some real numbers } \alpha \text{ and } M \text{ with } M > 0.$$
Then the Laplace transform of $f$ exists.

2. Suppose $F(s)$ exists for some function $f$. Then by definition, $\lim_{b \to \infty} \int_{0}^{b} f(t) e^{-st}\, dt$ exists. Now, one
can use the theory of improper integrals to conclude that
$$\lim_{s \to \infty} F(s) = 0.$$
Hence, a function $F(s)$ satisfying
$$\lim_{s \to \infty} F(s) \text{ does not exist or } \lim_{s \to \infty} F(s) \neq 0,$$
cannot be a Laplace transform of a function $f$.


Definition 10.2.4 (Inverse Laplace Transform) Let $\mathcal{L}(f(t)) = F(s)$. That is, $F(s)$ is the Laplace
transform of the function $f(t)$. Then $f(t)$ is called the inverse Laplace transform of $F(s)$. In that case, we write
$$f(t) = \mathcal{L}^{-1}(F(s)).$$

10.2.1 Examples


Example 10.2.5 = 1. Find F(s) = L(f(t)), where f(t) =1, t>0. 
—st |b —sb 
1 € 


Solution: F(s) =| e “dt = lim 


0 b—co —S 0 Ss b—co =S 
Note that if s > 0, then 


Thus, 
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In the remaining part of this chapter, whenever the improper integral is calculated, we will not explicitly 


write the limiting process. However, the students are advised to provide the details. 


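2. Find the Laplace transform $F(s)$ of $f(t)$, where $f(t) = t$, $t \ge 0$.
Solution: Integration by parts gives

\[ F(s) = \int_0^\infty t\, e^{-st}\, dt = \left. \frac{-t\, e^{-st}}{s} \right|_0^\infty + \int_0^\infty \frac{e^{-st}}{s}\, dt = \frac{1}{s^2} \quad \text{for } s > 0. \]

3. Find the Laplace transform of $f(t) = t^n$, $n$ a positive integer.
Solution: Substituting $st = \tau$, we get

\[ F(s) = \int_0^\infty e^{-st}\, t^n\, dt = \frac{1}{s^{n+1}} \int_0^\infty e^{-\tau}\, \tau^n\, d\tau = \frac{n!}{s^{n+1}} \quad \text{for } s > 0. \]

4. Find the Laplace transform of $f(t) = e^{at}$, $t \ge 0$.

Solution: We have

\[ \mathcal{L}(e^{at}) = \int_0^\infty e^{at}\, e^{-st}\, dt = \int_0^\infty e^{-(s-a)t}\, dt = \frac{1}{s-a} \quad \text{for } s > a. \]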


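5. Compute the Laplace transform of $\cos(at)$, $t \ge 0$.
Solution: Integrating by parts twice,

\[ \mathcal{L}(\cos(at)) = \int_0^\infty \cos(at)\, e^{-st}\, dt \]
\[ = \left. \cos(at)\, \frac{e^{-st}}{-s} \right|_0^\infty - \int_0^\infty \big(-a \sin(at)\big) \cdot \frac{e^{-st}}{-s}\, dt \]
\[ = \frac{1}{s} - \frac{a}{s} \left( \left. \sin(at)\, \frac{e^{-st}}{-s} \right|_0^\infty - \int_0^\infty a \cos(at)\, \frac{e^{-st}}{-s}\, dt \right) \]
\[ = \frac{1}{s} - \frac{a^2}{s^2} \int_0^\infty \cos(at)\, e^{-st}\, dt. \]

Note that the limits exist only when $s > 0$. Hence,

\[ \frac{a^2 + s^2}{s^2} \int_0^\infty \cos(at)\, e^{-st}\, dt = \frac{1}{s}. \quad \text{Thus } \mathcal{L}(\cos(at)) = \frac{s}{a^2 + s^2}, \quad s > 0. \]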


6. Similarly, one can show that
\[ \mathcal{L}(\sin(at)) = \frac{a}{s^2 + a^2}, \quad s > 0. \]


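7. Find the Laplace transform of $f(t) = \frac{1}{\sqrt{t}}$, $t > 0$.

Solution: Note that $f(t)$ is not a bounded function near $t = 0$ (why!). We will still show that the Laplace transform of $f(t)$ exists.

\[ \mathcal{L}\Big(\frac{1}{\sqrt{t}}\Big) = \int_0^\infty \frac{1}{\sqrt{t}}\, e^{-st}\, dt = \int_0^\infty \frac{\sqrt{s}}{\sqrt{\tau}}\, e^{-\tau}\, \frac{d\tau}{s} \quad (\text{substitute } \tau = st) \]
\[ = \frac{1}{\sqrt{s}} \int_0^\infty \tau^{-1/2}\, e^{-\tau}\, d\tau. \]

Recall that for calculating the integral $\int_0^\infty \tau^{-1/2} e^{-\tau}\, d\tau$, one needs to consider the double integral

\[ \int_0^\infty \int_0^\infty e^{-(x^2+y^2)}\, dx\, dy = \left( \int_0^\infty e^{-x^2}\, dx \right)^2 = \left( \frac{1}{2} \int_0^\infty \tau^{-1/2} e^{-\tau}\, d\tau \right)^2. \]

It turns out that

\[ \int_0^\infty \tau^{-1/2}\, e^{-\tau}\, d\tau = \sqrt{\pi}. \]

Thus, $\mathcal{L}\big(\frac{1}{\sqrt{t}}\big) = \frac{\sqrt{\pi}}{\sqrt{s}}$ for $s > 0$.

We now put the above discussed examples in tabular form as they constantly appear in applications of Laplace transform to differential equations.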


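f(t) | L(f(t)) | f(t) | L(f(t))
$1$ | $\frac{1}{s}$, $s > 0$ | $t$ | $\frac{1}{s^2}$, $s > 0$
$t^n$ | $\frac{n!}{s^{n+1}}$, $s > 0$ | $e^{at}$ | $\frac{1}{s-a}$, $s > a$
$\sin(at)$ | $\frac{a}{s^2+a^2}$, $s > 0$ | $\cos(at)$ | $\frac{s}{s^2+a^2}$, $s > 0$
$\sinh(at)$ | $\frac{a}{s^2-a^2}$, $s > |a|$ | $\cosh(at)$ | $\frac{s}{s^2-a^2}$, $s > |a|$

Table 10.1: Laplace transform of some Elementary Functions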
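The entries of Table 10.1 can be spot-checked numerically. The following is a minimal sketch, not part of the original notes: the helper `laplace_numeric`, the truncation point `upper` and the sample values `a = 2`, `s = 3` are all assumptions chosen for illustration. It replaces the improper integral by a Simpson-rule sum on a long finite interval, which is reasonable here because $s$ exceeds the growth rate of every integrand.

```python
import math

def laplace_numeric(f, s, upper=60.0, steps=60000):
    """Approximate F(s) = int_0^upper f(t) e^{-st} dt by composite Simpson's rule.

    `upper` truncates the improper integral; this assumes the integrand is
    negligible beyond it (true below, since s exceeds the growth rate of f).
    """
    if steps % 2:                      # Simpson's rule needs an even number of subintervals
        steps += 1
    h = upper / steps
    total = f(0.0) + f(upper) * math.exp(-s * upper)
    for i in range(1, steps):
        t = i * h
        weight = 4 if i % 2 else 2
        total += weight * f(t) * math.exp(-s * t)
    return total * h / 3

a, s = 2.0, 3.0                        # hypothetical sample values with s > |a|
table = [
    ("1",        lambda t: 1.0,               1 / s),
    ("t",        lambda t: t,                 1 / s**2),
    ("t^3",      lambda t: t**3,              math.factorial(3) / s**4),
    ("e^{at}",   lambda t: math.exp(a * t),   1 / (s - a)),
    ("sin(at)",  lambda t: math.sin(a * t),   a / (s**2 + a**2)),
    ("cos(at)",  lambda t: math.cos(a * t),   s / (s**2 + a**2)),
    ("sinh(at)", lambda t: math.sinh(a * t),  a / (s**2 - a**2)),
    ("cosh(at)", lambda t: math.cosh(a * t),  s / (s**2 - a**2)),
]

# Absolute difference between the numerical integral and the closed form.
errors = {name: abs(laplace_numeric(f, s) - exact) for name, f, exact in table}
for name, err in errors.items():
    print(f"{name:10s} |numeric - exact| = {err:.2e}")
```

Such a check does not prove any entry, but a large discrepancy would point to a slip in the closed form.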


10.3. Properties of Laplace Transform 


Lemma 10.3.1 (Linearity of Laplace Transform) 1. Let $a, b \in \mathbb{R}$. Then

\[ \mathcal{L}\big(a f(t) + b g(t)\big) = \int_0^\infty \big(a f(t) + b g(t)\big)\, e^{-st}\, dt = a\, \mathcal{L}(f(t)) + b\, \mathcal{L}(g(t)). \]

2. If $F(s) = \mathcal{L}(f(t))$ and $G(s) = \mathcal{L}(g(t))$, then
\[ \mathcal{L}^{-1}\big(a F(s) + b G(s)\big) = a f(t) + b g(t). \]

The above lemma is immediate from the definition of Laplace transform and the linearity of the definite integral.


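Example 10.3.2 1. Find the Laplace transform of $\cosh(at)$.

Solution: $\cosh(at) = \frac{e^{at} + e^{-at}}{2}$. Thus

\[ \mathcal{L}(\cosh(at)) = \frac{1}{2} \left( \frac{1}{s-a} + \frac{1}{s+a} \right) = \frac{s}{s^2 - a^2}, \quad s > |a|. \]

2. Similarly,

\[ \mathcal{L}(\sinh(at)) = \frac{1}{2} \left( \frac{1}{s-a} - \frac{1}{s+a} \right) = \frac{a}{s^2 - a^2}, \quad s > |a|. \]

Figure 10.2: f(t)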


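3. Find the inverse Laplace transform of $\frac{1}{s(s+1)}$.

Solution:
\[ \mathcal{L}^{-1}\Big( \frac{1}{s(s+1)} \Big) = \mathcal{L}^{-1}\Big( \frac{1}{s} - \frac{1}{s+1} \Big) = \mathcal{L}^{-1}\Big( \frac{1}{s} \Big) - \mathcal{L}^{-1}\Big( \frac{1}{s+1} \Big) = 1 - e^{-t}. \]
Thus, the inverse Laplace transform of $\frac{1}{s(s+1)}$ is $f(t) = 1 - e^{-t}$.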


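Theorem 10.3.3 (Scaling by a) Let $f(t)$ be a piecewise continuous function with Laplace transform $F(s)$. Then for $a > 0$,
\[ \mathcal{L}(f(at)) = \frac{1}{a}\, F\Big( \frac{s}{a} \Big). \]

Proof. By definition and the substitution $z = at$, we get

\[ \mathcal{L}(f(at)) = \int_0^\infty e^{-st} f(at)\, dt = \frac{1}{a} \int_0^\infty e^{-s\frac{z}{a}} f(z)\, dz = \frac{1}{a} \int_0^\infty e^{-\frac{s}{a} z} f(z)\, dz = \frac{1}{a}\, F\Big( \frac{s}{a} \Big). \]
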


Exercise 10.3.4 1. Find the Laplace transform of
\[ t^2 + at + b, \quad \cos(\omega t + \theta), \quad \cos^2 t, \quad \sinh^2 t; \]
where $a$, $b$, $\omega$ and $\theta$ are arbitrary constants.
2. Find the Laplace transform of the function $f(\cdot)$ given by the graphs in Figure 10.2.


Paha shinee, 


3. LOGO) = Sq + eet 


The next theorem relates the Laplace transform of the function f’(t) with that of f(t). 


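Theorem 10.3.5 (Laplace Transform of Differentiable Functions) Let $f(t)$, for $t > 0$, be a differentiable function with the derivative, $f'(t)$, being continuous. Suppose that there exist constants $M$ and $T$ such that $|f(t)| \le M e^{\alpha t}$ for all $t \ge T$. If $\mathcal{L}(f(t)) = F(s)$ then

\[ \mathcal{L}(f'(t)) = sF(s) - f(0) \quad \text{for } s > \alpha. \qquad (10.3.1) \]

Proof. Note that the condition $|f(t)| \le M e^{\alpha t}$ for all $t \ge T$ implies that

\[ \lim_{b\to\infty} f(b)\, e^{-sb} = 0 \quad \text{for } s > \alpha. \]

So, by definition, integrating by parts,

\[ \mathcal{L}(f'(t)) = \int_0^\infty e^{-st} f'(t)\, dt = \lim_{b\to\infty} \int_0^b e^{-st} f'(t)\, dt \]
\[ = \lim_{b\to\infty} \Big. f(t)\, e^{-st} \Big|_0^b - \lim_{b\to\infty} \int_0^b f(t)\, (-s)\, e^{-st}\, dt \]
\[ = -f(0) + sF(s). \]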


We can extend the above result to the $n$th derivative of a function $f(t)$, if $f'(t), \ldots, f^{(n-1)}(t), f^{(n)}(t)$ exist and $f^{(n)}(t)$ is continuous for $t \ge 0$. In this case, a repeated use of Theorem 10.3.5 gives the following corollary.

Corollary 10.3.6 Let $f(t)$ be a function with $\mathcal{L}(f(t)) = F(s)$. If $f'(t), \ldots, f^{(n-1)}(t), f^{(n)}(t)$ exist and $f^{(n)}(t)$ is continuous for $t \ge 0$, then

\[ \mathcal{L}\big(f^{(n)}(t)\big) = s^n F(s) - s^{n-1} f(0) - s^{n-2} f'(0) - \cdots - f^{(n-1)}(0). \qquad (10.3.2) \]
In particular, for $n = 2$, we have
\[ \mathcal{L}(f''(t)) = s^2 F(s) - s f(0) - f'(0). \qquad (10.3.3) \]
Corollary 10.3.7 Let $f'(t)$ be a piecewise continuous function for $t \ge 0$. Also, let $f(0) = 0$. Then
\[ \mathcal{L}(f'(t)) = sF(s) \quad \text{or equivalently} \quad \mathcal{L}^{-1}(sF(s)) = f'(t). \]


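Example 10.3.8 1. Find the inverse Laplace transform of $\frac{s}{s^2+1}$.

Solution: We know that $\mathcal{L}^{-1}\big(\frac{1}{s^2+1}\big) = \sin t$. Then $\sin(0) = 0$ and therefore, by Corollary 10.3.7, $\mathcal{L}^{-1}\big(\frac{s}{s^2+1}\big) = \cos t$.

2. Find the Laplace transform of $f(t) = \cos^2(t)$.
Solution: Note that $f(0) = 1$ and $f'(t) = -2 \cos t \sin t = -\sin(2t)$. Also,

\[ \mathcal{L}(-\sin(2t)) = -\frac{2}{s^2+4}. \]

Now, using Theorem 10.3.5, we get

\[ \mathcal{L}(f(t)) = \frac{1}{s} \Big( -\frac{2}{s^2+4} + 1 \Big) = \frac{s^2+2}{s(s^2+4)}. \]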


Lemma 10.3.9 (Laplace Transform of $t f(t)$) Let $f(t)$ be a piecewise continuous function with $\mathcal{L}(f(t)) = F(s)$. If the function $F(s)$ is differentiable, then

\[ \mathcal{L}(t f(t)) = -\frac{d}{ds} F(s). \]

Equivalently, $\mathcal{L}^{-1}\big( -\frac{d}{ds} F(s) \big) = t f(t)$.

Proof. By definition, $F(s) = \int_0^\infty e^{-st} f(t)\, dt$. The result is obtained by differentiating both sides with respect to $s$.


Suppose we know the Laplace transform of $f(t)$ and we wish to find the Laplace transform of the function $g(t) = \frac{f(t)}{t}$. Suppose that $G(s) = \mathcal{L}(g(t))$ exists. Then writing $f(t) = t g(t)$ gives

\[ F(s) = \mathcal{L}(f(t)) = \mathcal{L}(t g(t)) = -\frac{d}{ds} G(s). \]

Thus, $G(s) = -\int_a^s F(p)\, dp$ for some real number $a$. As $\lim_{s\to\infty} G(s) = 0$, we get $G(s) = \int_s^\infty F(p)\, dp$.

Hence, we have the following corollary.

Corollary 10.3.10 Let $\mathcal{L}(f(t)) = F(s)$ and $g(t) = \frac{f(t)}{t}$. Then

\[ \mathcal{L}(g(t)) = G(s) = \int_s^\infty F(p)\, dp. \]


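Example 10.3.11 1. Find $\mathcal{L}(t \sin(at))$.

Solution: We know $\mathcal{L}(\sin(at)) = \frac{a}{s^2+a^2}$. Hence, by Lemma 10.3.9, $\mathcal{L}(t \sin(at)) = \frac{2as}{(s^2+a^2)^2}$.

2. Find the function $f(t)$ such that $F(s) = \frac{4}{(s-1)^3}$.

Solution: We know $\mathcal{L}(e^t) = \frac{1}{s-1}$ and

\[ \frac{4}{(s-1)^3} = 2\, \frac{d}{ds} \Big( -\frac{1}{(s-1)^2} \Big) = 2\, \frac{d^2}{ds^2} \Big( \frac{1}{s-1} \Big). \]

By Lemma 10.3.9, we know that $\mathcal{L}(t f(t)) = -\frac{d}{ds} F(s)$. Suppose $\frac{d}{ds} F(s) = G(s)$. Then $g(t) = \mathcal{L}^{-1} G(s) = \mathcal{L}^{-1} \frac{d}{ds} F(s) = -t f(t)$. Therefore,

\[ \mathcal{L}^{-1} \Big( \frac{d^2}{ds^2} F(s) \Big) = \mathcal{L}^{-1} \Big( \frac{d}{ds} G(s) \Big) = -t g(t) = t^2 f(t). \]
Applying this with $\frac{1}{s-1}$ in place of $F(s)$, whose inverse transform is $e^t$, we get $f(t) = 2 t^2 e^t$.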


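Lemma 10.3.12 (Laplace Transform of an Integral) If $F(s) = \mathcal{L}(f(t))$ then

\[ \mathcal{L} \Big[ \int_0^t f(\tau)\, d\tau \Big] = \frac{F(s)}{s}. \]

Equivalently, $\mathcal{L}^{-1} \big( \frac{F(s)}{s} \big) = \int_0^t f(\tau)\, d\tau$.

Proof. By definition,

\[ \mathcal{L} \Big( \int_0^t f(\tau)\, d\tau \Big) = \int_0^\infty e^{-st} \Big( \int_0^t f(\tau)\, d\tau \Big) dt = \int_0^\infty \int_0^t e^{-st} f(\tau)\, d\tau\, dt. \]

We don't go into the details of the proof of the change in the order of integration. We assume that the order of the integrations can be changed and therefore

\[ \int_0^\infty \int_0^t e^{-st} f(\tau)\, d\tau\, dt = \int_0^\infty \int_\tau^\infty e^{-st} f(\tau)\, dt\, d\tau. \]

Thus,

\[ \mathcal{L} \Big( \int_0^t f(\tau)\, d\tau \Big) = \int_0^\infty \int_0^t e^{-st} f(\tau)\, d\tau\, dt \]
\[ = \int_0^\infty \int_\tau^\infty e^{-st} f(\tau)\, dt\, d\tau = \int_0^\infty \int_\tau^\infty e^{-s(t-\tau) - s\tau} f(\tau)\, dt\, d\tau \]
\[ = \int_0^\infty e^{-s\tau} f(\tau)\, d\tau \Big( \int_\tau^\infty e^{-s(t-\tau)}\, dt \Big) \]
\[ = \int_0^\infty e^{-s\tau} f(\tau)\, d\tau \Big( \int_0^\infty e^{-sz}\, dz \Big) = F(s)\, \frac{1}{s}. \]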


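Example 10.3.13 1. Find $\mathcal{L}\big( \int_0^t \sin(az)\, dz \big)$.

Solution: We know $\mathcal{L}(\sin(at)) = \frac{a}{s^2+a^2}$. Hence

\[ \mathcal{L} \Big( \int_0^t \sin(az)\, dz \Big) = \frac{1}{s} \cdot \frac{a}{s^2+a^2} = \frac{a}{s(s^2+a^2)}. \]

2. Find $\mathcal{L} \big( \int_0^t \tau^2\, d\tau \big)$.

Solution: By Lemma 10.3.12,

\[ \mathcal{L} \Big( \int_0^t \tau^2\, d\tau \Big) = \frac{\mathcal{L}(t^2)}{s} = \frac{1}{s} \cdot \frac{2!}{s^3} = \frac{2}{s^4}. \]

3. Find the function $f(t)$ such that $F(s) = \frac{4}{s(s-1)}$.

Solution: We know $\mathcal{L}(e^t) = \frac{1}{s-1}$. So,

\[ \mathcal{L}^{-1} \Big( \frac{4}{s(s-1)} \Big) = 4\, \mathcal{L}^{-1} \Big( \frac{1}{s} \cdot \frac{1}{s-1} \Big) = 4 \int_0^t e^{\tau}\, d\tau = 4(e^t - 1). \]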


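Lemma 10.3.14 (s-Shifting) Let $\mathcal{L}(f(t)) = F(s)$. Then $\mathcal{L}(e^{at} f(t)) = F(s-a)$ for $s > a$.

Proof.

\[ \mathcal{L}(e^{at} f(t)) = \int_0^\infty e^{at} f(t)\, e^{-st}\, dt = \int_0^\infty f(t)\, e^{-(s-a)t}\, dt = F(s-a), \quad s > a. \]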


Example 10.3.15 1. Find $\mathcal{L}(e^{at} \sin(bt))$.

Solution: We know $\mathcal{L}(\sin(bt)) = \frac{b}{s^2+b^2}$. Hence $\mathcal{L}(e^{at} \sin(bt)) = \frac{b}{(s-a)^2+b^2}$.

2. Find $\mathcal{L}^{-1} \Big( \frac{s-5}{(s-5)^2+36} \Big)$.

Solution: By s-Shifting, if $\mathcal{L}(f(t)) = F(s)$ then $\mathcal{L}(e^{at} f(t)) = F(s-a)$. Here, $a = 5$ and

\[ \mathcal{L}^{-1} \Big( \frac{s}{s^2+36} \Big) = \mathcal{L}^{-1} \Big( \frac{s}{s^2+6^2} \Big) = \cos(6t). \]

Hence, $f(t) = e^{5t} \cos(6t)$.

10.3.1 Inverse Transforms of Rational Functions


Let $F(s)$ be a rational function of $s$. We give a few examples to explain the methods for calculating the inverse Laplace transform of $F(s)$.


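Example 10.3.16 1. DENOMINATOR OF F HAS DISTINCT REAL ROOTS:

If $F(s) = \frac{(s+1)(s+3)}{s(s+2)(s+8)}$, find $f(t)$.

Solution: $F(s) = \frac{3}{16 s} + \frac{1}{12(s+2)} + \frac{35}{48(s+8)}$. Thus,
\[ f(t) = \frac{3}{16} + \frac{1}{12} e^{-2t} + \frac{35}{48} e^{-8t}. \]

2. DENOMINATOR OF F HAS DISTINCT COMPLEX ROOTS:

If $F(s) = \frac{4s+3}{s^2+2s+5}$, find $f(t)$.

Solution: $F(s) = 4\, \frac{s+1}{(s+1)^2+2^2} - \frac{1}{2} \cdot \frac{2}{(s+1)^2+2^2}$. Thus,
\[ f(t) = 4 e^{-t} \cos(2t) - \frac{1}{2} e^{-t} \sin(2t). \]

3. DENOMINATOR OF F HAS REPEATED REAL ROOTS:

If $F(s) = \frac{3s+4}{(s+1)(s^2+4s+4)}$, find $f(t)$.

Solution: Here,
\[ \frac{3s+4}{(s+1)(s^2+4s+4)} = \frac{3s+4}{(s+1)(s+2)^2} = \frac{a}{s+1} + \frac{b}{s+2} + \frac{c}{(s+2)^2}. \]

Solving for $a$, $b$ and $c$, we get $F(s) = \frac{1}{s+1} - \frac{1}{s+2} + \frac{2}{(s+2)^2} = \frac{1}{s+1} - \frac{1}{s+2} + 2\, \frac{d}{ds} \Big( -\frac{1}{s+2} \Big)$. Thus,

\[ f(t) = e^{-t} - e^{-2t} + 2t e^{-2t}. \]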
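For the distinct-real-roots case, the partial fraction coefficients can be computed mechanically: the coefficient of $\frac{1}{s - r_i}$ is the numerator evaluated at $r_i$ divided by the product of $(r_i - r_j)$ over the other roots. The sketch below is an illustration only; the helper names `poly_eval` and `residues_distinct_roots` are hypothetical, and the input is assumed to be $F(s) = \frac{(s+1)(s+3)}{s(s+2)(s+8)}$, the rational function that also appears in Example 10.4.2. Exact rational arithmetic is used so the fractions can be compared directly.

```python
from fractions import Fraction

def poly_eval(coeffs, x):
    """Evaluate a polynomial (coefficients listed from highest degree down) by Horner's rule."""
    result = Fraction(0)
    for c in coeffs:
        result = result * x + c
    return result

def residues_distinct_roots(numerator, roots):
    """Coefficients A_i in N(s) / prod(s - r_i) = sum A_i / (s - r_i).

    Assumes the roots are distinct and deg N is less than the number of roots.
    """
    coefficients = []
    for i, r in enumerate(roots):
        denom = Fraction(1)
        for j, q in enumerate(roots):
            if j != i:
                denom *= Fraction(r) - Fraction(q)
        coefficients.append(poly_eval(numerator, Fraction(r)) / denom)
    return coefficients

# (s+1)(s+3) = s^2 + 4s + 3, denominator roots 0, -2, -8
coeffs = residues_distinct_roots([1, 4, 3], [0, -2, -8])
print(coeffs)
```

Each coefficient $A_i$ then contributes a term $A_i e^{r_i t}$ to $f(t)$.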


10.3.2 Transform of Unit Step Function 


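Definition 10.3.17 (Unit Step Function) The Unit-Step function is defined by

\[ U_a(t) = \begin{cases} 0 & \text{if } 0 \le t < a \\ 1 & \text{if } t \ge a. \end{cases} \]

Example 10.3.18 $\mathcal{L}(U_a(t)) = \int_a^\infty e^{-st}\, dt = \frac{e^{-sa}}{s}$, $s > 0$.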


Lemma 10.3.19 (t-Shifting) Let $\mathcal{L}(f(t)) = F(s)$. Define $g(t)$ by

\[ g(t) = \begin{cases} 0 & \text{if } 0 \le t < a \\ f(t-a) & \text{if } t \ge a. \end{cases} \]

Then $g(t) = U_a(t) f(t-a)$ and
\[ \mathcal{L}(g(t)) = e^{-as} F(s). \]

Figure 10.3: Graphs of $f(t)$ and $U_a(t) f(t-a)$


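Proof. Let $0 \le t < a$. Then $U_a(t) = 0$ and so, $U_a(t) f(t-a) = 0 = g(t)$. If $t \ge a$, then $U_a(t) = 1$ and $U_a(t) f(t-a) = f(t-a) = g(t)$. Since the functions $g(t)$ and $U_a(t) f(t-a)$ take the same value for all $t \ge 0$, we have $g(t) = U_a(t) f(t-a)$. Thus,

\[ \mathcal{L}(g(t)) = \int_0^\infty e^{-st} g(t)\, dt = \int_a^\infty e^{-st} f(t-a)\, dt \]
\[ = \int_0^\infty e^{-s(t+a)} f(t)\, dt = e^{-as} \int_0^\infty e^{-st} f(t)\, dt = e^{-as} F(s). \]

Example 10.3.20 Find $\mathcal{L}^{-1} \Big( \frac{e^{-5s}}{s^2-4s-5} \Big)$.

Solution: Let $G(s) = \frac{e^{-5s}}{s^2-4s-5} = e^{-5s} F(s)$, with $F(s) = \frac{1}{s^2-4s-5}$. Since $s^2 - 4s - 5 = (s-2)^2 - 3^2$,

\[ \mathcal{L}^{-1}(F(s)) = \mathcal{L}^{-1} \Big( \frac{1}{3} \cdot \frac{3}{(s-2)^2 - 3^2} \Big) = \frac{1}{3} \sinh(3t)\, e^{2t}. \]

Hence, by Lemma 10.3.19,
\[ \mathcal{L}^{-1}(G(s)) = \frac{1}{3}\, U_5(t) \sinh\big(3(t-5)\big)\, e^{2(t-5)}. \]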

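Example 10.3.21 Find $\mathcal{L}(f(t))$, where
\[ f(t) = \begin{cases} 0 & t < 2\pi \\ t \cos t & t > 2\pi. \end{cases} \]

Solution: Note that

\[ f(t) = \begin{cases} 0 & t < 2\pi \\ (t - 2\pi) \cos(t - 2\pi) + 2\pi \cos(t - 2\pi) & t > 2\pi. \end{cases} \]

Thus,
\[ \mathcal{L}(f(t)) = e^{-2\pi s} \Big( \frac{s^2-1}{(s^2+1)^2} + 2\pi\, \frac{s}{s^2+1} \Big). \]

Note: To be filled by a graph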


10.4 Some Useful Results 


10.4.1 Limiting Theorems 


The following two theorems give us the behaviour of the function $f(t)$ when $t \to 0^+$ and when $t \to \infty$.

Theorem 10.4.1 (First Limit Theorem) Suppose $\mathcal{L}(f(t))$ exists. Then

\[ \lim_{t\to 0^+} f(t) = \lim_{s\to\infty} sF(s). \]

Proof. We know $sF(s) - f(0) = \mathcal{L}(f'(t))$. Therefore

\[ \lim_{s\to\infty} sF(s) = f(0) + \lim_{s\to\infty} \int_0^\infty e^{-st} f'(t)\, dt = f(0) + \int_0^\infty \lim_{s\to\infty} e^{-st} f'(t)\, dt = f(0), \]

as $\lim_{s\to\infty} e^{-st} = 0$.


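Example 10.4.2 1. For $t \ge 0$, let $Y(s) = \mathcal{L}(y(t)) = a(1+s^2)^{-1/2}$. Determine $a$ such that $y(0) = 1$.
Solution: Theorem 10.4.1 implies

\[ 1 = \lim_{s\to\infty} sY(s) = \lim_{s\to\infty} \frac{as}{(1+s^2)^{1/2}} = \lim_{s\to\infty} \frac{a}{(\frac{1}{s^2}+1)^{1/2}} = a. \quad \text{Thus, } a = 1. \]

2. If $F(s) = \frac{(s+1)(s+3)}{s(s+2)(s+8)}$, find $f(0^+)$.
Solution: Theorem 10.4.1 implies

\[ f(0^+) = \lim_{s\to\infty} sF(s) = \lim_{s\to\infty} s \cdot \frac{(s+1)(s+3)}{s(s+2)(s+8)} = 1. \]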


On similar lines, one has the following theorem. But this theorem is valid only when $f(t)$ is bounded as $t$ approaches infinity.

Theorem 10.4.3 (Second Limit Theorem) Suppose $\mathcal{L}(f(t))$ exists. Then

\[ \lim_{t\to\infty} f(t) = \lim_{s\to 0} sF(s) \]

provided that $sF(s)$ converges to a finite limit as $s$ tends to 0.

Proof.

\[ \lim_{s\to 0} sF(s) = f(0) + \lim_{s\to 0} \int_0^\infty e^{-st} f'(t)\, dt \]
\[ = f(0) + \lim_{s\to 0} \lim_{t\to\infty} \int_0^t e^{-s\tau} f'(\tau)\, d\tau \]
\[ = f(0) + \lim_{t\to\infty} \int_0^t \lim_{s\to 0} e^{-s\tau} f'(\tau)\, d\tau = \lim_{t\to\infty} f(t). \]

Example 10.4.4 If $F(s) = \frac{2(s+3)}{s(s+2)(s+8)}$, find $\lim_{t\to\infty} f(t)$.

Solution: From Theorem 10.4.3, we have

\[ \lim_{t\to\infty} f(t) = \lim_{s\to 0} sF(s) = \lim_{s\to 0} s \cdot \frac{2(s+3)}{s(s+2)(s+8)} = \frac{6}{16} = \frac{3}{8}. \]
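Both limit theorems can be illustrated on a function whose inverse transform is known. The sketch below is an assumption-laden illustration rather than part of the notes: it takes $F(s) = \frac{(s+1)(s+3)}{s(s+2)(s+8)}$ from Example 10.4.2 together with the inverse transform $f(t) = \frac{3}{16} + \frac{1}{12} e^{-2t} + \frac{35}{48} e^{-8t}$ obtained by partial fractions, and compares $sF(s)$ for very large and very small $s$ with $f$ near $0$ and for large $t$.

```python
import math

def F(s):
    """F(s) = (s+1)(s+3) / (s (s+2) (s+8)), as in Example 10.4.2."""
    return (s + 1) * (s + 3) / (s * (s + 2) * (s + 8))

def f(t):
    """Inverse transform of F obtained by partial fractions."""
    return 3 / 16 + math.exp(-2 * t) / 12 + 35 * math.exp(-8 * t) / 48

# First limit theorem: s F(s) for growing s should approach f(0+).
for s in (1e2, 1e4, 1e6):
    print(f"s = {s:8.0e}   s F(s) = {s * F(s):.8f}   f(0) = {f(0.0):.8f}")

# Second limit theorem: s F(s) for shrinking s should approach lim f(t) as t grows.
for s in (1e-2, 1e-4, 1e-6):
    print(f"s = {s:8.0e}   s F(s) = {s * F(s):.8f}   f(50) = {f(50.0):.8f}")
```

Here $f$ is bounded as $t \to \infty$, which is the situation in which the second theorem applies.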


We now generalise the lemma on Laplace transform of an integral as convolution theorem.

Definition 10.4.5 (Convolution of Functions) Let $f(t)$ and $g(t)$ be two smooth functions. The convolution, $f \star g$, is a function defined by

\[ (f \star g)(t) = \int_0^t f(\tau)\, g(t - \tau)\, d\tau. \]

Check that

1. $(f \star g)(t) = (g \star f)(t)$.

2. If $f(t) = \cos(t)$ then $(f \star f)(t) = \frac{t \cos(t) + \sin(t)}{2}$.

Theorem 10.4.6 (Convolution Theorem) If $F(s) = \mathcal{L}(f(t))$ and $G(s) = \mathcal{L}(g(t))$ then
\[ \mathcal{L} \Big[ \int_0^t f(\tau)\, g(t - \tau)\, d\tau \Big] = F(s) \cdot G(s). \]

Remark 10.4.7 Let $g(t) = 1$ for all $t \ge 0$. Then we know that $\mathcal{L}(g(t)) = G(s) = \frac{1}{s}$. Thus, the Convolution Theorem 10.4.6 reduces to the Integral Lemma 10.3.12.
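The two "check that" items after Definition 10.4.5, and the reduction noted in Remark 10.4.7, can be explored numerically. The following sketch is only an illustration: the function name `convolve`, the Simpson-rule discretisation and the sample point `t = 1.7` are assumptions, not part of the notes.

```python
import math

def convolve(f, g, t, steps=2000):
    """Approximate (f * g)(t) = int_0^t f(tau) g(t - tau) d tau by composite Simpson's rule."""
    if t == 0:
        return 0.0
    if steps % 2:                      # Simpson's rule needs an even number of subintervals
        steps += 1
    h = t / steps
    total = f(0.0) * g(t) + f(t) * g(0.0)
    for i in range(1, steps):
        tau = i * h
        weight = 4 if i % 2 else 2
        total += weight * f(tau) * g(t - tau)
    return total * h / 3

t = 1.7                                # hypothetical sample point

# Item 2: (cos * cos)(t) = (t cos t + sin t) / 2
cos_cos = convolve(math.cos, math.cos, t)
closed_form = (t * math.cos(t) + math.sin(t)) / 2

# Item 1: commutativity, tried on the pair f(t) = t, g(t) = e^t
fg = convolve(lambda x: x, math.exp, t)
gf = convolve(math.exp, lambda x: x, t)

# Remark 10.4.7: convolving with g = 1 gives the integral of f from 0 to t
sin_one = convolve(math.sin, lambda x: 1.0, t)

print(cos_cos, closed_form)
print(fg, gf)
print(sin_one, 1 - math.cos(t))
```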


10.5 Application to Differential Equations 


Consider the following example. 


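Example 10.5.1 Solve the following Initial Value Problem:

\[ a f''(t) + b f'(t) + c f(t) = g(t) \quad \text{with } f(0) = f_0,\ f'(0) = f_1. \]

Solution: Let $\mathcal{L}(g(t)) = G(s)$. Taking the Laplace transform of both sides and using Corollary 10.3.6, we get

\[ G(s) = a\big( s^2 F(s) - s f(0) - f'(0) \big) + b\big( sF(s) - f(0) \big) + cF(s), \]

and the initial conditions imply

\[ G(s) = (as^2 + bs + c) F(s) - (as + b) f_0 - a f_1. \]

Hence,
\[ F(s) = \underbrace{\frac{G(s)}{as^2+bs+c}}_{\text{non-homogeneous part}} + \underbrace{\frac{(as+b) f_0}{as^2+bs+c} + \frac{a f_1}{as^2+bs+c}}_{\text{initial conditions}}. \qquad (10.5.1) \]

Now, if we know that $G(s)$ is a rational function of $s$ then we can compute $f(t)$ from $F(s)$ by using the method of PARTIAL FRACTIONS (see Subsection 10.3.1).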


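Example 10.5.2 1. Solve the IVP

\[ y'' - 4y' - 5y = f(t) = \begin{cases} t & \text{if } 0 \le t < 5 \\ t + 1 & \text{if } t \ge 5 \end{cases} \]
with $y(0) = 1$ and $y'(0) = 4$.
Solution: Note that $f(t) = t + U_5(t)$. Thus,
\[ \mathcal{L}(f(t)) = \frac{1}{s^2} + \frac{e^{-5s}}{s}. \]

Taking Laplace transform of the above equation, we get

\[ \big( s^2 Y(s) - s y(0) - y'(0) \big) - 4\big( sY(s) - y(0) \big) - 5Y(s) = \mathcal{L}(f(t)) = \frac{1}{s^2} + \frac{e^{-5s}}{s}. \]

Which gives

\[ Y(s) = \frac{s}{(s+1)(s-5)} + \frac{e^{-5s}}{s(s+1)(s-5)} + \frac{1}{s^2(s+1)(s-5)} \]
\[ = \frac{1}{6} \Big[ \frac{5}{s-5} + \frac{1}{s+1} \Big] + \frac{e^{-5s}}{30} \Big[ -\frac{6}{s} + \frac{5}{s+1} + \frac{1}{s-5} \Big] + \frac{1}{150} \Big[ -\frac{30}{s^2} + \frac{24}{s} - \frac{25}{s+1} + \frac{1}{s-5} \Big]. \]
Hence,

\[ y(t) = \frac{5 e^{5t}}{6} + \frac{e^{-t}}{6} + U_5(t) \Big[ -\frac{1}{5} + \frac{e^{-(t-5)}}{6} + \frac{e^{5(t-5)}}{30} \Big] + \frac{1}{150} \Big[ -30t + 24 - 25 e^{-t} + e^{5t} \Big]. \]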


Remark 10.5.3 Even though f(t) is a DISCONTINUOUS function at t = 5, the solution y(t) and y’(t) 
are continuous functions of t, as y"” exists. In general, the following is always true: 
Let y(t) be a solution of ay” + by’+cy = f(t). Then both y(t) and y'(t) are continuous functions of time. 
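A solution obtained by transform methods can be checked directly against the differential equation. The sketch below is an illustration under stated assumptions: it takes the solution of Example 10.5.2 in the form $y(t) = y_s(t) + U_5(t)\, w(t-5)$, where $y_s(t) = \frac{5e^{5t}}{6} + \frac{e^{-t}}{6} + \frac{1}{150}(-30t + 24 - 25e^{-t} + e^{5t})$ and $w(\tau) = -\frac{1}{5} + \frac{e^{-\tau}}{6} + \frac{e^{5\tau}}{30}$, with forcing $f(t) = t + U_5(t)$. The names `y_smooth` and `w` and the hand-coded derivatives are hypothetical helpers. Because $w(0) = w'(0) = 0$, adding $U_5(t)\, w(t-5)$ keeps $y$ and $y'$ continuous at $t = 5$, as Remark 10.5.3 asserts.

```python
import math

# Part of y(t) driven by the initial conditions and by the forcing term t.
def y_smooth(t):
    return 5 * math.exp(5 * t) / 6 + math.exp(-t) / 6 \
        + (-30 * t + 24 - 25 * math.exp(-t) + math.exp(5 * t)) / 150

def dy_smooth(t):
    return 25 * math.exp(5 * t) / 6 - math.exp(-t) / 6 \
        + (-30 + 25 * math.exp(-t) + 5 * math.exp(5 * t)) / 150

def d2y_smooth(t):
    return 125 * math.exp(5 * t) / 6 + math.exp(-t) / 6 \
        + (-25 * math.exp(-t) + 25 * math.exp(5 * t)) / 150

# Response to the unit step switched on at t = 5, written in tau = t - 5.
def w(tau):
    return -1 / 5 + math.exp(-tau) / 6 + math.exp(5 * tau) / 30

def dw(tau):
    return -math.exp(-tau) / 6 + math.exp(5 * tau) / 6

def d2w(tau):
    return math.exp(-tau) / 6 + 5 * math.exp(5 * tau) / 6

def y(t):
    return y_smooth(t) + (w(t - 5) if t >= 5 else 0.0)

# Residual of y'' - 4y' - 5y = t for the smooth part, and of w'' - 4w' - 5w = 1 for the step part.
residual_smooth = d2y_smooth(1.0) - 4 * dy_smooth(1.0) - 5 * y_smooth(1.0) - 1.0
residual_step = d2w(0.5) - 4 * dw(0.5) - 5 * w(0.5) - 1.0

print("y(0) =", y(0.0), " y'(0) =", dy_smooth(0.0))
print("residuals:", residual_smooth, residual_step)
print("jump in y and y' at t = 5:", w(0.0), dw(0.0))
```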


Example 10.5.4 1. Consider the IVP $t y''(t) + y'(t) + t y(t) = 0$, with $y(0) = 1$ and $y'(0) = 0$. Find $\mathcal{L}(y(t))$.

Solution: Applying Laplace transform, we have

\[ -\frac{d}{ds} \big[ s^2 Y(s) - s y(0) - y'(0) \big] + \big( sY(s) - y(0) \big) - \frac{d}{ds} Y(s) = 0. \]
Using initial conditions, the above equation reduces to

\[ \frac{d}{ds} \big[ (s^2+1) Y(s) - s \big] - sY(s) + 1 = 0. \]

This equation after simplification can be rewritten as

\[ \frac{Y'(s)}{Y(s)} = -\frac{s}{s^2+1}. \]

Therefore, $Y(s) = a(1+s^2)^{-1/2}$. From Example 10.4.2.1, we see that $a = 1$ and hence

\[ Y(s) = (1+s^2)^{-1/2}. \]

2. Show that $y(t) = \int_0^t f(\tau)\, g(t-\tau)\, d\tau$ is a solution of

\[ y''(t) + a y'(t) + b y(t) = f(t), \quad \text{with } y(0) = y'(0) = 0; \]

where $\mathcal{L}[g(t)] = \frac{1}{s^2+as+b}$.

Solution: Here, $Y(s) = \frac{F(s)}{s^2+as+b} = F(s) \cdot \frac{1}{s^2+as+b}$. Hence,

\[ y(t) = (f \star g)(t) = \int_0^t f(\tau)\, g(t-\tau)\, d\tau. \]

3. Show that $y(t) = \frac{1}{a} \int_0^t f(\tau) \sin(a(t-\tau))\, d\tau$ is a solution of

\[ y''(t) + a^2 y(t) = f(t), \quad \text{with } y(0) = y'(0) = 0. \]

Solution: Here, $Y(s) = \frac{F(s)}{s^2+a^2} = \frac{1}{a} \Big( F(s) \cdot \frac{a}{s^2+a^2} \Big)$. Hence,

\[ y(t) = \frac{1}{a}\, f(t) \star \sin(at) = \frac{1}{a} \int_0^t f(\tau) \sin(a(t-\tau))\, d\tau. \]

4. Solve the following IVP.

\[ y'(t) = \int_0^t y(\tau)\, d\tau + t - 4 \sin t, \quad \text{with } y(0) = 1. \]

Solution: Taking Laplace transform of both sides and using Theorem 10.3.5, we get

\[ sY(s) - 1 = \frac{Y(s)}{s} + \frac{1}{s^2} - 4\, \frac{1}{s^2+1}. \]
Solving for $Y(s)$, we get
\[ Y(s) = \frac{s^2-1}{s(s^2+1)} = \frac{1}{s} - 2\, \frac{1}{s(s^2+1)}. \]
So,
\[ y(t) = 1 - 2 \int_0^t \sin(\tau)\, d\tau = 1 + 2(\cos t - 1) = 2 \cos t - 1. \]


10.6 Transform of the Unit-Impulse Function 


Consider the following example. 


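Example 10.6.1 Find the Laplace transform, $D_h(s)$, of

\[ \delta_h(t) = \begin{cases} 0 & t < 0 \\ \frac{1}{h} & 0 \le t < h \\ 0 & t > h. \end{cases} \]
Solution: Note that $\delta_h(t) = \frac{1}{h} \big( U_0(t) - U_h(t) \big)$. By linearity of the Laplace transform, we get
\[ D_h(s) = \frac{1}{h} \Big( \frac{1 - e^{-hs}}{s} \Big). \]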
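It is instructive to tabulate $D_h(s) = \frac{1 - e^{-hs}}{hs}$ for a fixed $s$ as the pulse width $h$ shrinks. The short sketch below is only an illustration; the function name `D` and the sample value `s = 2` are assumptions. It uses `math.expm1` so that $1 - e^{-hs}$ is computed without cancellation when $hs$ is tiny.

```python
import math

def D(h, s):
    """Laplace transform of the pulse delta_h: (1 - e^{-hs}) / (hs)."""
    return -math.expm1(-h * s) / (h * s)

s = 2.0                                # hypothetical sample value
for h in (1.0, 0.1, 0.01, 0.001, 1e-6):
    print(f"h = {h:8.0e}   D_h(s) = {D(h, s):.8f}")
```

The printed values increase towards 1 as $h$ decreases, for this and any other fixed $s > 0$.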


Remark 10.6.2 1. Observe that in Example 10.6.1, if we allow $h$ to approach 0, we obtain a new function, say $\delta(t)$. That is, let
\[ \delta(t) = \lim_{h\to 0} \delta_h(t). \]
This new function is zero everywhere except at the origin. At the origin, this function tends to infinity. In other words, the graph of the function appears as a line of infinite height at the origin. This new function, $\delta(t)$, is called the UNIT-IMPULSE FUNCTION (or Dirac's delta function).

2. We can also write
\[ \delta(t) = \lim_{h\to 0} \delta_h(t) = \lim_{h\to 0} \frac{1}{h} \big( U_0(t) - U_h(t) \big). \]

3. In the strict mathematical sense $\lim_{h\to 0} \delta_h(t)$ does not exist. Hence, mathematically speaking, $\delta(t)$ does not represent a function.

4. However, note that
\[ \int_0^\infty \delta_h(t)\, dt = 1, \quad \text{for all } h. \]

5. Also, observe that $\mathcal{L}(\delta_h(t)) = \frac{1 - e^{-hs}}{hs}$. Now, if we take the limit of both sides, as $h$ approaches zero (apply L'Hospital's rule), we get

\[ \mathcal{L}(\delta(t)) = \lim_{h\to 0} \frac{1 - e^{-hs}}{hs} = \lim_{h\to 0} \frac{s\, e^{-hs}}{s} = 1. \]

Part IV


Numerical Applications 


Chapter 11 


Newton’s Interpolation Formulae 


11.1 Introduction 


In many practical situations, for a function $y = f(x)$, which either may not be explicitly specified or may be difficult to handle, we often have tabulated data $(x_i, y_i)$, where $y_i = f(x_i)$ and $x_i < x_{i+1}$ for $i = 0, 1, 2, \ldots, N$. In such cases, it may be required to represent or replace the given function by a simpler function, which coincides with the values of $f$ at the $N+1$ tabular points $x_i$. This process is known as INTERPOLATION. Interpolation is also used to estimate the value of the function at the non-tabular points. Here, we shall consider only those functions which are sufficiently smooth, i.e., they are differentiable a sufficient number of times. Many of the interpolation methods, where the tabular points are equally spaced, use difference operators. Hence, in the following we introduce various difference operators and study their properties before looking at the interpolation methods.

We shall assume here that the TABULAR POINTS $x_0, x_1, x_2, \ldots, x_N$ are equally spaced, i.e., $x_k - x_{k-1} = h$ for each $k = 1, 2, \ldots, N$. The real number $h$ is called the STEP LENGTH. This gives us $x_k = x_0 + kh$. Further, $y_k = f(x_k)$ gives the value of the function $y = f(x)$ at the $k$th tabular point. The points $y_1, y_2, \ldots, y_N$ are known as NODES or NODAL VALUES.


11.2 Difference Operator 


11.2.1 Forward Difference Operator 


Definition 11.2.1 (First Forward Difference Operator) We define the FORWARD DIFFERENCE OPERATOR, denoted by $\Delta$, as

\[ \Delta f(x) = f(x+h) - f(x). \]

The expression $f(x+h) - f(x)$ gives the FIRST FORWARD DIFFERENCE of $f(x)$ and the operator $\Delta$ is called the FIRST FORWARD DIFFERENCE OPERATOR. Given the step size $h$, this formula uses the values at $x$ and $x+h$, the point at the next step. As it is moving in the forward direction, it is called the forward difference operator.


[Figure: the tabular points $x_0, x_1, \ldots, x_{k-1}, x_k, x_{k+1}, \ldots, x_n$ marked on a line, with arrows indicating the backward and forward directions.]

Definition 11.2.2 (Second Forward Difference Operator) The second forward difference operator, $\Delta^2$, is defined as

\[ \Delta^2 f(x) = \Delta\big( \Delta f(x) \big) = \Delta f(x+h) - \Delta f(x). \]
We note that

\[ \Delta^2 f(x) = \Delta f(x+h) - \Delta f(x) \]
\[ = \big( f(x+2h) - f(x+h) \big) - \big( f(x+h) - f(x) \big) \]
\[ = f(x+2h) - 2 f(x+h) + f(x). \]

In particular, for $x = x_k$, we get
\[ \Delta y_k = y_{k+1} - y_k \]
and
\[ \Delta^2 y_k = \Delta y_{k+1} - \Delta y_k = y_{k+2} - 2 y_{k+1} + y_k. \]

Definition 11.2.3 (rth Forward Difference Operator) The $r$th forward difference operator, $\Delta^r$, is defined as
\[ \Delta^r f(x) = \Delta^{r-1} f(x+h) - \Delta^{r-1} f(x), \quad r = 1, 2, \ldots, \]
with $\Delta^0 f(x) = f(x)$.


Exercise 11.2.4 Show that $\Delta^3 y_k = \Delta^2(\Delta y_k) = \Delta(\Delta^2 y_k)$. In general, show that for any positive integers $r$ and $m$ with $r > m$,

\[ \Delta^r y_k = \Delta^{r-m}(\Delta^m y_k) = \Delta^m(\Delta^{r-m} y_k). \]


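Example 11.2.5 For the tabulated values of $y = f(x)$ find $\Delta y_3$ and $\Delta^3 y_2$.

i | 0 | 1 | 2 | 3 | 4 | 5
$y_i$ | 0.05 | 0.11 | 0.26 | 0.35 | 0.49 | 0.67

Solution: Here,

\[ \Delta y_3 = y_4 - y_3 = 0.49 - 0.35 = 0.14, \text{ and} \]

\[ \Delta^3 y_2 = \Delta(\Delta^2 y_2) = \Delta(y_4 - 2 y_3 + y_2) \]
\[ = (y_5 - y_4) - 2(y_4 - y_3) + (y_3 - y_2) \]
\[ = y_5 - 3 y_4 + 3 y_3 - y_2 \]
\[ = 0.67 - 3 \times 0.49 + 3 \times 0.35 - 0.26 = -0.01. \]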


Remark 11.2.6 Using mathematical induction, it can be shown that

\[ \Delta^r y_k = \sum_{j=0}^{r} (-1)^{r-j} \binom{r}{j}\, y_{k+j}. \]

Thus the $r$th forward difference at $y_k$ uses the values at $y_k, y_{k+1}, \ldots, y_{k+r}$.
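The recursive definition of $\Delta^r$ and the binomial formula of Remark 11.2.6 can be compared on actual numbers. The sketch below is illustrative only: the function names are hypothetical, and the nodal values $y_0, \ldots, y_5$ are the ones that appear in the computations of Examples 11.2.5 and 11.2.12.

```python
from math import comb

def forward_difference(ys, k, r):
    """r-th forward difference at index k, straight from the recursive definition."""
    if r == 0:
        return ys[k]
    return forward_difference(ys, k + 1, r - 1) - forward_difference(ys, k, r - 1)

def forward_difference_binomial(ys, k, r):
    """Same quantity via the alternating binomial sum of Remark 11.2.6."""
    return sum((-1) ** (r - j) * comb(r, j) * ys[k + j] for j in range(r + 1))

ys = [0.05, 0.11, 0.26, 0.35, 0.49, 0.67]   # y_0, ..., y_5

print(forward_difference(ys, 3, 1))           # first difference at y_3
print(forward_difference(ys, 2, 3))           # third difference at y_2
print(forward_difference_binomial(ys, 2, 3))  # same value from the binomial sum
```

The recursion recomputes lower differences many times, so for long tables one would build the difference table column by column instead.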


Example 11.2.7 If $f(x) = x^2 + ax + b$, where $a$ and $b$ are real constants, calculate $\Delta^r f(x)$.

Solution: We first calculate $\Delta f(x)$ as follows:

\[ \Delta f(x) = f(x+h) - f(x) = \big[ (x+h)^2 + a(x+h) + b \big] - \big[ x^2 + ax + b \big] = 2xh + h^2 + ah. \]
Now
\[ \Delta^2 f(x) = \Delta f(x+h) - \Delta f(x) = \big[ 2(x+h)h + h^2 + ah \big] - \big[ 2xh + h^2 + ah \big] = 2h^2, \]
and
\[ \Delta^3 f(x) = \Delta^2 f(x+h) - \Delta^2 f(x) = 2h^2 - 2h^2 = 0. \]

Thus, $\Delta^r f(x) = 0$ for all $r \ge 3$.


Remark 11.2.8 In general, if $f(x) = x^n + a_1 x^{n-1} + a_2 x^{n-2} + \cdots + a_{n-1} x + a_n$ is a polynomial of degree $n$, then it can be shown that

\[ \Delta^n f(x) = n!\, h^n \quad \text{and} \quad \Delta^{n+r} f(x) = 0 \quad \text{for } r = 1, 2, \ldots. \]
The reader is advised to prove the above statement.


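Remark 11.2.9 1. For a set of tabular values, the horizontal forward difference table is written as:

$x_0$ | $y_0$ | $\Delta y_0 = y_1 - y_0$ | $\Delta^2 y_0 = \Delta y_1 - \Delta y_0$ | $\cdots$ | $\Delta^n y_0 = \Delta^{n-1} y_1 - \Delta^{n-1} y_0$
$x_1$ | $y_1$ | $\Delta y_1 = y_2 - y_1$ | $\Delta^2 y_1 = \Delta y_2 - \Delta y_1$ | $\cdots$
$x_2$ | $y_2$ | $\Delta y_2 = y_3 - y_2$ | $\Delta^2 y_2 = \Delta y_3 - \Delta y_2$ | $\cdots$
$\vdots$
$x_{n-1}$ | $y_{n-1}$ | $\Delta y_{n-1} = y_n - y_{n-1}$
$x_n$ | $y_n$

2. The same table can also be written in the diagonal form, in which each difference is placed midway between the two entries it is computed from:

$x_0$ | $y_0$
 | | $\Delta y_0$
$x_1$ | $y_1$ | | $\Delta^2 y_0$
 | | $\Delta y_1$ | | $\Delta^3 y_0$
$x_2$ | $y_2$ | | $\Delta^2 y_1$
$\vdots$
$x_{n-2}$ | $y_{n-2}$ | | $\Delta^2 y_{n-3}$
 | | $\Delta y_{n-2}$ | | $\Delta^3 y_{n-3}$
$x_{n-1}$ | $y_{n-1}$ | | $\Delta^2 y_{n-2}$
 | | $\Delta y_{n-1}$
$x_n$ | $y_n$

However, in the following, we shall mostly adhere to horizontal form only.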


11.2.2. Backward Difference Operator 


Definition 11.2.10 (First Backward Difference Operator) The FIRST BACKWARD DIFFERENCE OPERATOR, denoted by $\nabla$, is defined as

\[ \nabla f(x) = f(x) - f(x-h). \]

Given the step size $h$, note that this formula uses the values at $x$ and $x-h$, the point at the previous step. As it moves in the backward direction, it is called the backward difference operator.

Definition 11.2.11 (rth Backward Difference Operator) The $r$th backward difference operator, $\nabla^r$, is defined as

\[ \nabla^r f(x) = \nabla^{r-1} f(x) - \nabla^{r-1} f(x-h), \quad r = 1, 2, \ldots, \]
with $\nabla^0 f(x) = f(x)$.

In particular, for $x = x_k$, we get
\[ \nabla y_k = y_k - y_{k-1} \quad \text{and} \quad \nabla^2 y_k = y_k - 2 y_{k-1} + y_{k-2}. \]
Note that $\nabla^2 y_k = \Delta^2 y_{k-2}$.
Example 11.2.12 Using the tabulated values in Example 11.2.5, find $\nabla y_4$ and $\nabla^3 y_3$.
Solution: We have $\nabla y_4 = y_4 - y_3 = 0.49 - 0.35 = 0.14$, and

\[ \nabla^3 y_3 = \nabla^2 y_3 - \nabla^2 y_2 = (y_3 - 2 y_2 + y_1) - (y_2 - 2 y_1 + y_0) \]
\[ = y_3 - 3 y_2 + 3 y_1 - y_0 \]
\[ = 0.35 - 3 \times 0.26 + 3 \times 0.11 - 0.05 = -0.15. \]


Example 11.2.13 If $f(x) = x^2 + ax + b$, where $a$ and $b$ are real constants, calculate $\nabla^r f(x)$.

Solution: We first calculate $\nabla f(x)$ as follows:

\[ \nabla f(x) = f(x) - f(x-h) = \big[ x^2 + ax + b \big] - \big[ (x-h)^2 + a(x-h) + b \big] = 2xh - h^2 + ah. \]
Now,
\[ \nabla^2 f(x) = \nabla f(x) - \nabla f(x-h) = \big[ 2xh - h^2 + ah \big] - \big[ 2(x-h)h - h^2 + ah \big] = 2h^2, \]
and
\[ \nabla^3 f(x) = \nabla^2 f(x) - \nabla^2 f(x-h) = 2h^2 - 2h^2 = 0. \]

Thus, $\nabla^r f(x) = 0$ for all $r \ge 3$.


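Remark 11.2.14 For a set of tabular values, the backward difference table in the horizontal form is written as:

$x_0$ | $y_0$
$x_1$ | $y_1$ | $\nabla y_1 = y_1 - y_0$
$x_2$ | $y_2$ | $\nabla y_2 = y_2 - y_1$ | $\nabla^2 y_2 = \nabla y_2 - \nabla y_1$
$\vdots$
$x_{n-1}$ | $y_{n-1}$ | $\nabla y_{n-1} = y_{n-1} - y_{n-2}$ | $\cdots$
$x_n$ | $y_n$ | $\nabla y_n = y_n - y_{n-1}$ | $\nabla^2 y_n = \nabla y_n - \nabla y_{n-1}$ | $\cdots$ | $\nabla^n y_n = \nabla^{n-1} y_n - \nabla^{n-1} y_{n-1}$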


Example 11.2.15 For the following set of tabular values $(x_i, y_i)$, write the forward and backward difference tables.

$x_i$ | 9 | 10 | 11 | 12 | 13 | 14
$y_i$ | 5.0 | 5.4 | 6.0 | 6.8 | 7.5 | 8.1

Solution: The forward difference table is written as

$x$ | $y$ | $\Delta y$ | $\Delta^2 y$ | $\Delta^3 y$ | $\Delta^4 y$ | $\Delta^5 y$
9 | 5.0 | 0.4 = 5.4 - 5.0 | 0.2 = 0.6 - 0.4 | 0.0 = 0.2 - 0.2 | -0.3 = -0.3 - 0.0 | 0.6 = 0.3 - (-0.3)
10 | 5.4 | 0.6 | 0.2 | -0.3 | 0.3
11 | 6.0 | 0.8 | -0.1 | 0.0
12 | 6.8 | 0.7 | -0.1
13 | 7.5 | 0.6
14 | 8.1

In the similar manner, the backward difference table is written as follows:

$x$ | $y$ | $\nabla y$ | $\nabla^2 y$ | $\nabla^3 y$ | $\nabla^4 y$ | $\nabla^5 y$
9 | 5.0
10 | 5.4 | 0.4
11 | 6.0 | 0.6 | 0.2
12 | 6.8 | 0.8 | 0.2 | 0.0
13 | 7.5 | 0.7 | -0.1 | -0.3 | -0.3
14 | 8.1 | 0.6 | -0.1 | 0.0 | 0.3 | 0.6

Observe from the above two tables that $\Delta^3 y_1 = \nabla^3 y_4$, $\Delta^2 y_3 = \nabla^2 y_5$, $\Delta^4 y_1 = \nabla^4 y_5$ etc.
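Difference tables of this kind are easy to generate column by column. The sketch below is an illustration, with hypothetical helper names, using the data of Example 11.2.15 (taking the last nodal value as 8.1, the value the tables are built from). It also reads backward differences off the forward table through the relation $\nabla^r y_k = \Delta^r y_{k-r}$.

```python
def forward_table(ys):
    """Return the columns [y, Δy, Δ²y, ...]; column r holds Δ^r y_0, Δ^r y_1, ..."""
    columns = [list(ys)]
    while len(columns[-1]) > 1:
        prev = columns[-1]
        columns.append([prev[i + 1] - prev[i] for i in range(len(prev) - 1)])
    return columns

def backward_difference(columns, k, r):
    """∇^r y_k, read off the forward table using ∇^r y_k = Δ^r y_{k-r}."""
    return columns[r][k - r]

xs = [9, 10, 11, 12, 13, 14]
ys = [5.0, 5.4, 6.0, 6.8, 7.5, 8.1]
cols = forward_table(ys)

# Print the horizontal forward difference table, rounding away floating-point noise.
for i, x in enumerate(xs):
    row = [f"{cols[r][i]:6.1f}" for r in range(len(cols)) if i < len(cols[r])]
    print(f"{x:3d}", *row)
```

Since the entries are computed in floating point, values such as 0.0 in the table may appear internally as tiny nonzero numbers; the printing step rounds them.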


Exercise 11.2.16 1. Show that A?y, = V3 yz. 
2. Prove that $\Delta(\nabla y_k) = \Delta^2 y_{k-1} = \nabla^2 y_{k+1}$.

3. Obtain $\nabla^k y_k$ in terms of $y_0, y_1, y_2, \ldots, y_k$. Hence show that $\nabla^k y_k = \Delta^k y_0$.

Remark 11.2.17 In general it can be shown that $\Delta^k f(x) = \nabla^k f(x + kh)$ or $\Delta^k y_m = \nabla^k y_{k+m}$.

Remark 11.2.18 In view of the remarks (11.2.8) and (11.2.17) it is obvious that, if $y = f(x)$ is a polynomial function of degree $n$, then $\nabla^n f(x)$ is constant and $\nabla^{n+r} f(x) = 0$ for $r > 0$.


11.2.3. Central Difference Operator 


Definition 11.2.19 (Central Difference Operator) The FIRST CENTRAL DIFFERENCE OPERATOR, denoted by $\delta$, is defined by

\[ \delta f(x) = f\big(x + \tfrac{h}{2}\big) - f\big(x - \tfrac{h}{2}\big) \]

and the $r$th CENTRAL DIFFERENCE OPERATOR is defined as

\[ \delta^r f(x) = \delta^{r-1} f\big(x + \tfrac{h}{2}\big) - \delta^{r-1} f\big(x - \tfrac{h}{2}\big) \]
with $\delta^0 f(x) = f(x)$.

Thus, $\delta^2 f(x) = f(x+h) - 2 f(x) + f(x-h)$.
In particular, for $x = x_k$, define $y_{k+\frac{1}{2}} = f(x_k + \frac{h}{2})$ and $y_{k-\frac{1}{2}} = f(x_k - \frac{h}{2})$; then

\[ \delta y_k = y_{k+\frac{1}{2}} - y_{k-\frac{1}{2}} \quad \text{and} \quad \delta^2 y_k = y_{k+1} - 2 y_k + y_{k-1}. \]

Thus, $\delta^2$ uses the table of $(x_k, y_k)$. It is easy to see that only the even central differences use the tabular point values $(x_k, y_k)$.

11.2.4 Shift Operator


Definition 11.2.20 (Shift Operator) A SHIFT OPERATOR, denoted by $E$, is the operator which shifts the value at the next point with step $h$, i.e.,

\[ E f(x) = f(x+h). \]
Thus,
\[ E y_i = y_{i+1}, \quad E^2 y_i = y_{i+2}, \quad \text{and} \quad E^k y_i = y_{i+k}. \]
11.2.5 Averaging Operator 


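Definition 11.2.21 (Averaging Operator) The AVERAGING OPERATOR, denoted by $\mu$, gives the average value between two central points, i.e.,

\[ \mu f(x) = \frac{1}{2} \Big[ f\big(x + \tfrac{h}{2}\big) + f\big(x - \tfrac{h}{2}\big) \Big]. \]

Thus $\mu y_i = \frac{1}{2} \big( y_{i+\frac{1}{2}} + y_{i-\frac{1}{2}} \big)$ and

\[ \mu^2 y_i = \frac{1}{2} \Big[ \mu y_{i+\frac{1}{2}} + \mu y_{i-\frac{1}{2}} \Big] = \frac{1}{4} \big[ y_{i+1} + 2 y_i + y_{i-1} \big]. \]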


11.3. Relations between Difference operators 


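1. We note that

\[ E f(x) = f(x+h) = \big[ f(x+h) - f(x) \big] + f(x) = \Delta f(x) + f(x) = (\Delta + 1) f(x). \]

Thus,
\[ E = 1 + \Delta \quad \text{or} \quad \Delta = E - 1. \]

2. Further, $\nabla(E(f(x))) = \nabla(f(x+h)) = f(x+h) - f(x)$. Thus,
\[ (1 - \nabla) E f(x) = E(f(x)) - \nabla(E(f(x))) = f(x+h) - \big[ f(x+h) - f(x) \big] = f(x). \]

Thus $E = 1 + \Delta$ gives us

\[ (1 - \nabla)(1 + \Delta) f(x) = f(x) \quad \text{for all } x. \]

So we write,
\[ (1 + \Delta)^{-1} = 1 - \nabla \quad \text{or} \quad \nabla = 1 - (1 + \Delta)^{-1}, \quad \text{and} \quad (1 - \nabla)^{-1} = 1 + \Delta = E. \]
Similarly,
\[ \Delta = (1 - \nabla)^{-1} - 1. \]

3. Let us denote by $E^{\frac{1}{2}} f(x) = f(x + \frac{h}{2})$. Then, we see that

\[ \delta f(x) = f\big(x + \tfrac{h}{2}\big) - f\big(x - \tfrac{h}{2}\big) = E^{\frac{1}{2}} f(x) - E^{-\frac{1}{2}} f(x). \]

Thus,
\[ \delta = E^{\frac{1}{2}} - E^{-\frac{1}{2}}. \]

Recall,

\[ \delta^2 f(x) = f(x+h) - 2 f(x) + f(x-h) = \big[ f(x+h) + 2 f(x) + f(x-h) \big] - 4 f(x) = 4(\mu^2 - 1) f(x). \]

So, we have,

\[ \mu^2 = \frac{\delta^2}{4} + 1 \quad \text{or} \quad \mu = \sqrt{1 + \frac{\delta^2}{4}}. \]

That is, the action of $\sqrt{1 + \frac{\delta^2}{4}}$ is same as that of $\mu$.

4. We further note that,

\[ \Delta f(x) = f(x+h) - f(x) = \frac{1}{2} \big[ f(x+h) - 2 f(x) + f(x-h) \big] + \frac{1}{2} \big[ f(x+h) - f(x-h) \big] \]
\[ = \frac{1}{2} \delta^2 (f(x)) + \frac{1}{2} \big[ f(x+h) - f(x-h) \big] \]
and
\[ \delta \mu f(x) = \delta \Big[ \frac{1}{2} \Big\{ f\big(x + \tfrac{h}{2}\big) + f\big(x - \tfrac{h}{2}\big) \Big\} \Big] = \frac{1}{2} \Big[ \{ f(x+h) - f(x) \} + \{ f(x) - f(x-h) \} \Big] = \frac{1}{2} \big[ f(x+h) - f(x-h) \big]. \]
Thus,
\[ \Delta f(x) = \Big[ \frac{1}{2} \delta^2 + \delta \mu \Big] f(x), \]
i.e.,
\[ \Delta = \frac{1}{2} \delta^2 + \delta \mu = \frac{1}{2} \delta^2 + \delta \sqrt{1 + \frac{\delta^2}{4}}. \]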

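In view of the above discussion, we have the following table showing the relations between various difference operators:

  | $E$ | $\Delta$ | $\nabla$ | $\delta$
$E$ | $E$ | $\Delta + 1$ | $(1 - \nabla)^{-1}$ | $\frac{1}{2} \delta^2 + \delta \sqrt{1 + \frac{1}{4} \delta^2} + 1$
$\Delta$ | $E - 1$ | $\Delta$ | $(1 - \nabla)^{-1} - 1$ | $\frac{1}{2} \delta^2 + \delta \sqrt{1 + \frac{1}{4} \delta^2}$
$\nabla$ | $1 - E^{-1}$ | $1 - (1 + \Delta)^{-1}$ | $\nabla$ | $-\frac{1}{2} \delta^2 + \delta \sqrt{1 + \frac{1}{4} \delta^2}$
$\delta$ | $E^{1/2} - E^{-1/2}$ | $\Delta (1 + \Delta)^{-1/2}$ | $\nabla (1 - \nabla)^{-1/2}$ | $\delta$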

Exercise 11.3.1 1. Verify the validity of the above table. 
2. Obtain the relations between the averaging operator and other difference operators. 


3. Find $\Delta^2 y_2$, $\nabla^2 y_2$, $\delta^2 y_2$ and $\mu^2 y_2$ for the following tabular values:

$x_i$ | 93.0 | 96.5 | 100.0 | 103.5 | 107.0
$y_i$ | 11.3 | 12.5 | 14.0 | 15.2 | 16.0

11.4. Newton’s Interpolation Formulae 


As stated earlier, interpolation is the process of approximating a given function, whose values are known at $N+1$ tabular points, by a suitable polynomial, $P_N(x)$, of degree $N$ which takes the values $y_i$ at $x = x_i$ for $i = 0, 1, \ldots, N$. Note that if the given data has errors, it will also be reflected in the polynomial so obtained.

In the following, we shall use forward and backward differences to obtain polynomial function approximating $y = f(x)$, when the tabular points $x_i$'s are equally spaced. Let

\[ f(x) \approx P_N(x), \]

where the polynomial $P_N(x)$ is given in the following form:

\[ P_N(x) = a_0 + a_1 (x - x_0) + a_2 (x - x_0)(x - x_1) + \cdots + a_k (x - x_0)(x - x_1) \cdots (x - x_{k-1}) + \cdots + a_N (x - x_0)(x - x_1) \cdots (x - x_{N-1}) \qquad (11.4.1) \]
for some constants $a_0, a_1, \ldots, a_N$, to be determined using the fact that $P_N(x_i) = y_i$ for $i = 0, 1, \ldots, N$.

So, for $i = 0$, substitute $x = x_0$ in (11.4.1) to get $P_N(x_0) = y_0$. This gives us $a_0 = y_0$. Next,

\[ P_N(x_1) = y_1 \Rightarrow y_1 = a_0 + (x_1 - x_0) a_1. \]

So, $a_1 = \frac{y_1 - y_0}{h} = \frac{\Delta y_0}{h}$. For $i = 2$, $y_2 = a_0 + (x_2 - x_0) a_1 + (x_2 - x_1)(x_2 - x_0) a_2$, or equivalently
\[ 2h^2 a_2 = y_2 - y_0 - 2h \Big( \frac{\Delta y_0}{h} \Big) = y_2 - 2 y_1 + y_0 = \Delta^2 y_0. \]
Thus, $a_2 = \frac{\Delta^2 y_0}{2h^2}$. Now, using mathematical induction, we get
\[ a_k = \frac{\Delta^k y_0}{k!\, h^k} \quad \text{for } k = 0, 1, 2, \ldots, N. \]
Thus,
\[ P_N(x) = y_0 + \frac{\Delta y_0}{h} (x - x_0) + \frac{\Delta^2 y_0}{2!\, h^2} (x - x_0)(x - x_1) + \cdots + \frac{\Delta^k y_0}{k!\, h^k} (x - x_0) \cdots (x - x_{k-1}) + \cdots + \frac{\Delta^N y_0}{N!\, h^N} (x - x_0) \cdots (x - x_{N-1}). \]

As this uses the forward differences, it is called NEWTON’S FORWARD DIFFERENCE FORMULA for inter- 
polation, or simply, forward interpolation formula. 


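Exercise 11.4.1 Show that

\[ a_3 = \frac{\Delta^3 y_0}{3!\, h^3} \quad \text{and} \quad a_4 = \frac{\Delta^4 y_0}{4!\, h^4} \]
and in general,
\[ a_k = \frac{\Delta^k y_0}{k!\, h^k} \quad \text{for } k = 0, 1, 2, \ldots, N. \]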
For the sake of numerical calculations, we give below a convenient form of the forward interpolation formula.
Let $u = \frac{x - x_0}{h}$, then

\[ x - x_1 = hu + x_0 - (x_0 + h) = h(u-1), \quad x - x_2 = h(u-2), \quad \ldots, \quad x - x_k = h(u-k), \text{ etc.} \]

With this transformation the above forward interpolation formula is simplified to the following form:

\[ P_N(u) = y_0 + \frac{\Delta y_0}{h} (hu) + \frac{\Delta^2 y_0}{2!\, h^2} \{ (hu)(h(u-1)) \} + \cdots + \frac{\Delta^k y_0\, h^k}{k!\, h^k} \big[ u(u-1) \cdots (u-k+1) \big] + \cdots \]
\[ = y_0 + \Delta y_0 (u) + \frac{\Delta^2 y_0}{2!} (u(u-1)) + \cdots + \frac{\Delta^k y_0}{k!} \big[ u(u-1) \cdots (u-k+1) \big] + \cdots + \frac{\Delta^N y_0}{N!} \big[ u(u-1) \cdots (u-N+1) \big]. \qquad (11.4.2) \]
If $N = 1$, we have a linear interpolation given by

\[ f(u) \approx y_0 + \Delta y_0 (u). \qquad (11.4.3) \]

For $N = 2$, we get a quadratic interpolating polynomial:

\[ f(u) \approx y_0 + \Delta y_0 (u) + \frac{\Delta^2 y_0}{2!} \big[ u(u-1) \big] \qquad (11.4.4) \]

and so on.

It may be pointed out here that if $f(x)$ is a polynomial function of degree $N$ then $P_N(x)$ coincides with $f(x)$ on the given interval. Otherwise, this gives only an approximation to the true values of $f(x)$.
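Formula (11.4.2) translates directly into a short routine: compute the leading differences $y_0, \Delta y_0, \Delta^2 y_0, \ldots$ and accumulate the products $u(u-1)\cdots(u-k+1)/k!$ incrementally. The sketch below is one possible implementation, not the notes' own; the name `newton_forward` and its `degree` parameter are hypothetical. The two data sets are the tabular values of Example 11.4.5 (items 1 and 2).

```python
def newton_forward(x0, h, ys, x, degree=None):
    """Evaluate Newton's forward-difference polynomial (11.4.2) at x.

    ys are the nodal values at x0, x0 + h, x0 + 2h, ...; `degree` (at most
    len(ys) - 1) selects how many leading differences are used.
    """
    n = len(ys) - 1 if degree is None else degree
    diffs, col = [ys[0]], list(ys)
    for _ in range(n):
        col = [col[i + 1] - col[i] for i in range(len(col) - 1)]
        diffs.append(col[0])           # leading entry of each column: Δ^k y_0
    u = (x - x0) / h
    total, term = 0.0, 1.0             # term holds u(u-1)...(u-k+1)/k!
    for k, d in enumerate(diffs):
        total += term * d
        term *= (u - k) / (k + 1)
    return total

# Item 1 of Example 11.4.5: fifth-degree polynomial evaluated at x = 0.0045.
ys1 = [1.121, 1.123, 1.1255, 1.127, 1.128, 1.1285]
p5 = newton_forward(0.0, 0.001, ys1, 0.0045)

# Item 2 of Example 11.4.5: cubic through the first four tan values, at x = 0.71.
ys2 = [0.84229, 0.87707, 0.91309, 0.95045, 0.98926]
p3 = newton_forward(0.70, 0.02, ys2, 0.71, degree=3)

print(p5, p3)
```

Evaluating at a tabular point with the full degree returns the corresponding nodal value, up to rounding, which is a quick sanity check on any implementation.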

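If we are given the additional point $x_{N+1}$ also, then the error, denoted by $R_N(x) = |P_N(x) - f(x)|$, is estimated by

\[ R_N(x) \simeq \Big| \frac{\Delta^{N+1} y_0}{h^{N+1} (N+1)!} (x - x_0)(x - x_1) \cdots (x - x_N) \Big| = \Big| \frac{\Delta^{N+1} y_0}{(N+1)!}\, u(u-1) \cdots (u-N) \Big|. \]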
Similarly, if we assume $P_N(x)$ is of the form
\[ P_N(x) = b_0 + b_1 (x - x_N) + b_2 (x - x_N)(x - x_{N-1}) + \cdots + b_N (x - x_N)(x - x_{N-1}) \cdots (x - x_1), \]

then using the fact that $P_N(x_i) = y_i$, we have

\[ b_0 = y_N \]
\[ b_1 = \frac{1}{h} (y_N - y_{N-1}) = \frac{1}{h} \nabla y_N \]
\[ b_2 = \frac{y_N - 2 y_{N-1} + y_{N-2}}{2h^2} = \frac{1}{2h^2} (\nabla^2 y_N) \]
\[ \vdots \]
\[ b_k = \frac{1}{k!\, h^k} \nabla^k y_N. \]

Thus, using backward differences and the transformation $x = x_N + hu$, we obtain the Newton's backward interpolation formula as follows:
\[ P_N(u) = y_N + u \nabla y_N + \frac{u(u+1)}{2!} \nabla^2 y_N + \cdots + \frac{u(u+1) \cdots (u+N-1)}{N!} \nabla^N y_N. \qquad (11.4.5) \]

Exercise 11.4.2 Derive the Newton's backward interpolation formula (11.4.5) for $N = 3$.
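The backward formula (11.4.5) can be coded in the same way as the forward one, using the last entry of each difference column, since $\nabla^k y_N$ is the final entry of the $k$th column. The sketch below is an illustration with a hypothetical function name `newton_backward`; the data are the tabular values of Example 11.2.15 (with $y = 8.1$ at $x = 14$), and the evaluation point $x = 13.5$ with a cubic is the case treated in Example 11.4.5.

```python
def newton_backward(xN, h, ys, x, degree=None):
    """Evaluate Newton's backward-difference polynomial (11.4.5) at x.

    ys are the nodal values at ..., xN - 2h, xN - h, xN; `degree` (at most
    len(ys) - 1) selects how many backward differences at xN are used.
    """
    n = len(ys) - 1 if degree is None else degree
    diffs, col = [ys[-1]], list(ys)
    for _ in range(n):
        col = [col[i + 1] - col[i] for i in range(len(col) - 1)]
        diffs.append(col[-1])          # last entry of each column: ∇^k y_N
    u = (x - xN) / h
    total, term = 0.0, 1.0             # term holds u(u+1)...(u+k-1)/k!
    for k, d in enumerate(diffs):
        total += term * d
        term *= (u + k) / (k + 1)
    return total

ys = [5.0, 5.4, 6.0, 6.8, 7.5, 8.1]    # values at x = 9, 10, ..., 14
estimate = newton_backward(14.0, 1.0, ys, 13.5, degree=3)
print(estimate)
```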


Remark 11.4.3 If the interpolating point lies closer to the beginning of the interval then one uses the 
Newton’s forward formula and if it lies towards the end of the interval then Newton’s backward formula 
is used. 


Remark 11.4.4 For a given set of n tabular points, in general, all the n points need not be used for 
interpolating polynomial. In fact N is so chosen that N th forward/backward difference almost remains 
constant. Thus N is less than or equal to n. 


Example 11.4.5 1. Obtain the Newton's forward interpolating polynomial, $P_5(x)$, for the following tabular data and interpolate the value of the function at $x = 0.0045$.

$x$ | 0 | 0.001 | 0.002 | 0.003 | 0.004 | 0.005
$y$ | 1.121 | 1.123 | 1.1255 | 1.127 | 1.128 | 1.1285

Solution: For this data, we have the forward difference table

$x_i$ | $y_i$ | $\Delta y_i$ | $\Delta^2 y_i$ | $\Delta^3 y_i$ | $\Delta^4 y_i$ | $\Delta^5 y_i$
0 | 1.121 | 0.002 | 0.0005 | -0.0015 | 0.002 | -0.0025
0.001 | 1.123 | 0.0025 | -0.0010 | 0.0005 | -0.0005
0.002 | 1.1255 | 0.0015 | -0.0005 | 0.0
0.003 | 1.127 | 0.001 | -0.0005
0.004 | 1.128 | 0.0005
0.005 | 1.1285

Thus, for $x = x_0 + hu$, where $x_0 = 0$, $h = 0.001$ and $u = \frac{x - x_0}{h}$, we get
\[ P_5(x) = 1.121 + u \times 0.002 + \frac{u(u-1)}{2} (0.0005) + \frac{u(u-1)(u-2)}{3!} \times (-0.0015) + \frac{u(u-1)(u-2)(u-3)}{4!} (0.002) + \frac{u(u-1)(u-2)(u-3)(u-4)}{5!} \times (-0.0025). \]
Thus,
\[ P_5(0.0045) = P_5(0 + 0.001 \times 4.5) \]
\[ = 1.121 + 0.002 \times 4.5 + \frac{0.0005}{2} \times 4.5 \times 3.5 - \frac{0.0015}{6} \times 4.5 \times 3.5 \times 2.5 + \frac{0.002}{24} \times 4.5 \times 3.5 \times 2.5 \times 1.5 - \frac{0.0025}{120} \times 4.5 \times 3.5 \times 2.5 \times 1.5 \times 0.5 \]
\[ \approx 1.1284004. \]


2. Using the following table for $\tan x$, approximate its value at 0.71. Also, find an error estimate (Note $\tan(0.71) = 0.85953$).

$x_i$ | 0.70 | 0.72 | 0.74 | 0.76 | 0.78
$\tan x_i$ | 0.84229 | 0.87707 | 0.91309 | 0.95045 | 0.98926

Solution: As the point $x = 0.71$ lies towards the initial tabular values, we shall use Newton's Forward formula. The forward difference table is:

$x_i$ | $y_i$ | $\Delta y_i$ | $\Delta^2 y_i$ | $\Delta^3 y_i$ | $\Delta^4 y_i$
0.70 | 0.84229 | 0.03478 | 0.00124 | 0.0001 | 0.00001
0.72 | 0.87707 | 0.03602 | 0.00134 | 0.00011
0.74 | 0.91309 | 0.03736 | 0.00145
0.76 | 0.95045 | 0.03881
0.78 | 0.98926

In the above table, we note that $\Delta^3 y$ is almost constant, so we shall attempt 3rd degree polynomial interpolation.

Note that $x_0 = 0.70$, $h = 0.02$ gives $u = \frac{0.71 - 0.70}{0.02} = 0.5$. Thus, using forward interpolating polynomial of degree 3, we get

\[ P_3(u) = 0.84229 + 0.03478 u + \frac{0.00124}{2!} u(u-1) + \frac{0.0001}{3!} u(u-1)(u-2). \]

Thus,
\[ \tan(0.71) \approx 0.84229 + 0.03478 (0.5) + \frac{0.00124}{2!} \times 0.5 \times (-0.5) + \frac{0.0001}{3!} \times 0.5 \times (-0.5) \times (-1.5) = 0.859531. \]

An error estimate for the approximate value is

\[ \Big| \frac{\Delta^4 y_0}{4!}\, u(u-1)(u-2)(u-3) \Big|_{u=0.5} = 0.00000039. \]
Note that exact value of tan(0.71) (upto 5 decimal place) is 0.85953. and the approximate value, 
obtained using the Newton's interpolating polynomial is very close to this value. This is also reflected 


by the error estimate given above. 
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3. Apply 3rd degree interpolation polynomial for the set of values given in Example[11.2.15| to estimate 
the value of (10.3) by taking 


Also, find approximate value of f (13.5). 

Solution: Note that $x = 10.3$ is closer to the values lying in the beginning of the tabular values, while $x = 13.5$ is towards the end of the tabular values. Therefore, we shall use the forward difference formula for $x = 10.3$ and the backward difference formula for $x = 13.5$. Recall that the interpolating polynomial of degree 3 is given by
\[ f(x_0 + hu) = y_0 + \Delta y_0\, u + \frac{\Delta^2 y_0}{2!}\, u(u-1) + \frac{\Delta^3 y_0}{3!}\, u(u-1)(u-2). \]


Therefore,

(a) for $x_0 = 9.0$, $h = 1.0$ and $x = 10.3$, we have $u = \frac{10.3 - 9.0}{1} = 1.3$. This gives,
\[ f(10.3) \approx 5 + 0.4 \times 1.3 + \frac{0.2}{2!}(1.3) \times 0.3 + \frac{0.0}{3!}(1.3) \times 0.3 \times (-0.7) = 5.559. \]

(b) for $x_0 = 10.0$, $h = 1.0$ and $x = 10.3$, we have $u = \frac{10.3 - 10.0}{1} = 0.3$. This gives,
\[ f(10.3) \approx 5.4 + 0.6 \times 0.3 + \frac{0.2}{2!}(0.3) \times (-0.7) + \frac{-0.3}{3!}(0.3) \times (-0.7) \times (-1.7) = 5.54115. \]

Note: as $x = 10.3$ is closer to $x = 10.0$, we may expect the estimate calculated using $x_0 = 10.0$ to be a better approximation.
(c) for $x = 13.5$, we use the backward interpolating polynomial, which gives,
\[ f(x_N + hu) \approx y_N + \nabla y_N\, u + \frac{\nabla^2 y_N}{2!}\, u(u+1) + \frac{\nabla^3 y_N}{3!}\, u(u+1)(u+2). \]
Therefore, taking $x_N = 14$, $h = 1.0$ and $x = 13.5$, we have $u = \frac{13.5 - 14}{1} = -0.5$. This gives,
\[ f(13.5) \approx 8.1 + 0.6 \times (-0.5) + \frac{-0.1}{2!}(-0.5) \times 0.5 + \frac{0.0}{3!}(-0.5) \times 0.5 \times (1.5) = 7.8125. \]

Exercise 11.4.6 1. The following data is available for a function $y = f(x)$:

x    0      0.2     0.4     0.6     0.8     1.0
y    1.0    0.808   0.664   0.616   0.712   1.0

Compute the value of the function at $x = 0.3$ and $x = 1.1$.


2. The speed of a train, running between two stations, is measured at different distances from the starting
station. If x is the distance in km. from the starting station, then v(x), the speed (in km/hr) of the 
train at the distance x is given by the following table: 

x 0 50 100 150 200 250 
v(x) 0 60 80 110 90 0 


Find the approximate speed of the train at the mid point between the two stations. 
3. The following table gives the values of the function $S(x) = \int_0^x \sin\left(\frac{\pi}{2} t^2\right) dt$ at different values of the tabular points $x$:

x       0    0.04      0.08      0.12      0.16      0.20
S(x)    0    0.00003   0.00026   0.00090   0.00214   0.00419

Obtain a fifth degree interpolating polynomial for $S(x)$. Compute $S(0.02)$ and also find an error estimate for it.

4. The following data gives the temperatures (in °C) between 8.00 am and 8.00 pm on May 10, 2005 in Kanpur:

Time           8 am    12 noon    4 pm    8 pm
Temperature    30      37         43      38

Obtain Newton’s backward interpolating polynomial of degree 3 to compute the temperature in Kanpur on that day at 5.00 pm.


Chapter 12 


Lagrange’s Interpolation Formula 


12.1 Introduction 


In the previous chapter, we derived the interpolation formula when the values of the function are given at equidistant tabular points $x_0, x_1, \ldots, x_n$. However, it is not always possible to obtain values of the function $y = f(x)$ at equidistant points $x_i$. In view of this, it is desirable to derive an interpolating formula which is applicable even for unequally spaced points. Lagrange’s Formula is one such interpolating formula. Unlike the previous interpolating formulas, it does not use the notion of differences. However, we shall introduce the concept of divided differences before coming to it.


12.2 Divided Differences 


Definition 12.2.1 (First Divided Difference) The ratio
\[ \frac{f(x_i) - f(x_j)}{x_i - x_j} \]
for any two points $x_i$ and $x_j$ is called the FIRST DIVIDED DIFFERENCE of $f(x)$ relative to $x_i$ and $x_j$. It is denoted by $\delta[x_i, x_j]$.


Let us assume that the function $y = f(x)$ is linear. Then $\delta[x_i, x_j]$ is constant for any two tabular points $x_i$ and $x_j$, i.e., it is independent of $x_i$ and $x_j$. Hence,
\[ \delta[x_i, x_j] = \frac{f(x_i) - f(x_j)}{x_i - x_j} = \delta[x_j, x_i]. \]
Thus, for a linear function $f(x)$, if we take the points $x$, $x_0$ and $x_1$ then $\delta[x_0, x] = \delta[x_0, x_1]$, i.e.,
\[ \frac{f(x) - f(x_0)}{x - x_0} = \delta[x_0, x_1]. \]
Thus, $f(x) = f(x_0) + (x - x_0)\,\delta[x_0, x_1]$.

So, if $f(x)$ is approximated with a linear polynomial, then the value of the function at any point $x$ can be calculated by using $f(x) \approx P_1(x) = f(x_0) + (x - x_0)\,\delta[x_0, x_1]$, where $\delta[x_0, x_1]$ is the first divided difference of $f$ relative to $x_0$ and $x_1$.


Definition 12.2.2 (Second Divided Difference) The ratio
\[ \delta[x_i, x_j, x_k] = \frac{\delta[x_j, x_k] - \delta[x_i, x_j]}{x_k - x_i} \]
is defined as the SECOND DIVIDED DIFFERENCE of $f(x)$ relative to $x_i$, $x_j$ and $x_k$.
If $f(x)$ is a second degree polynomial then $\delta[x_0, x]$ is a linear function of $x$. Hence,
\[ \delta[x_i, x_j, x_k] = \frac{\delta[x_j, x_k] - \delta[x_i, x_j]}{x_k - x_i} \quad \text{is constant.} \]
In view of the above, for a polynomial function of degree 2, we have $\delta[x, x_0, x_1] = \delta[x_0, x_1, x_2]$. Thus,
\[ \frac{\delta[x_0, x_1] - \delta[x, x_0]}{x_1 - x} = \delta[x_0, x_1, x_2]. \]
This gives,
\[ \delta[x, x_0] = \delta[x_0, x_1] + (x - x_1)\,\delta[x_0, x_1, x_2]. \]
From this we obtain,
\[ f(x) = f(x_0) + (x - x_0)\,\delta[x_0, x_1] + (x - x_0)(x - x_1)\,\delta[x_0, x_1, x_2]. \]

So, whenever $f(x)$ is approximated with a second degree polynomial, the value of $f(x)$ at any point $x$ can be computed using the above polynomial, which uses the values at three points $x_0$, $x_1$ and $x_2$.


Example 12.2.3 Using the following tabular values for a function $y = f(x)$, obtain its second degree polynomial approximation.

x_i       0.1     0.16    0.2
f(x_i)    1.12    1.24    1.40

Also, find the approximate value of the function at $x = 0.13$.
Solution: We shall first calculate the desired divided differences.
\[ \delta[x_0, x_1] = (1.24 - 1.12)/(0.16 - 0.1) = 2, \]
\[ \delta[x_1, x_2] = (1.40 - 1.24)/(0.2 - 0.16) = 4, \quad \text{and} \]
\[ \delta[x_0, x_1, x_2] = \frac{\delta[x_1, x_2] - \delta[x_0, x_1]}{x_2 - x_0} = (4 - 2)/(0.2 - 0.1) = 20. \]
Thus,
\[ f(x) \approx P_2(x) = 1.12 + 2(x - 0.1) + 20(x - 0.1)(x - 0.16). \]
Therefore
\[ f(0.13) \approx 1.12 + 2(0.13 - 0.1) + 20(0.13 - 0.1)(0.13 - 0.16) = 1.162. \]
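
The divided-difference computation above generalizes directly to any number of points. The following Python sketch is an illustration rather than part of the original notes; the function names `divided_differences` and `newton_eval` are assumptions introduced here. It builds the coefficients $\delta[x_0], \delta[x_0, x_1], \ldots$ in place and evaluates the resulting polynomial by nested multiplication.

```python
def divided_differences(xs, ys):
    """Return [f(x0), δ[x0,x1], δ[x0,x1,x2], ...] for distinct points xs."""
    coeffs = list(ys)
    n = len(xs)
    for k in range(1, n):
        # Update from the end so lower-order entries are still available.
        for i in range(n - 1, k - 1, -1):
            coeffs[i] = (coeffs[i] - coeffs[i - 1]) / (xs[i] - xs[i - k])
    return coeffs


def newton_eval(xs, coeffs, x):
    """Evaluate f(x0) + δ[x0,x1](x-x0) + δ[x0,x1,x2](x-x0)(x-x1) + ..."""
    result = coeffs[-1]
    for i in range(len(coeffs) - 2, -1, -1):
        result = result * (x - xs[i]) + coeffs[i]
    return result


xs = [0.1, 0.16, 0.2]
ys = [1.12, 1.24, 1.40]
coeffs = divided_differences(xs, ys)    # approximately [1.12, 2, 20]
print(newton_eval(xs, coeffs, 0.13))    # approximately 1.162
```

The points need not be equidistant or sorted; they only need to be distinct.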


Exercise 12.2.4 1. Using the following table, which gives values of $\log(x)$ corresponding to certain values of $x$, find the approximate value of $\log(323.5)$ with the help of a second degree polynomial.

x         322.8     324.2     325
log(x)    2.50893   2.51081   2.5118

2. Show that
\[ \delta[x_0, x_1, x_2] = \frac{f(x_0)}{(x_0 - x_1)(x_0 - x_2)} + \frac{f(x_1)}{(x_1 - x_0)(x_1 - x_2)} + \frac{f(x_2)}{(x_2 - x_0)(x_2 - x_1)}. \]
So, $\delta[x_0, x_1, x_2] = \delta[x_0, x_2, x_1] = \delta[x_1, x_0, x_2] = \delta[x_1, x_2, x_0] = \delta[x_2, x_0, x_1] = \delta[x_2, x_1, x_0]$. That is, the second divided difference remains unchanged regardless of how its arguments are interchanged.

3. Show that for equidistant points $x_0$, $x_1$ and $x_2$,
\[ \delta[x_0, x_1, x_2] = \frac{\Delta^2 y_0}{2h^2} = \frac{\nabla^2 y_2}{2h^2}, \]
where $y_k = f(x_k)$, and $h = x_1 - x_0 = x_2 - x_1$.
4. Show that for a linear function, the second divided difference with respect to any three points $x_i$, $x_j$ and $x_k$ is always zero.

Now, we define the $k$th divided difference.

Definition 12.2.5 ($k$th Divided Difference) The $k$th DIVIDED DIFFERENCE of $f(x)$ relative to the tabular points $x_0, x_1, \ldots, x_k$ is defined recursively as
\[ \delta[x_0, x_1, \ldots, x_k] = \frac{\delta[x_1, x_2, \ldots, x_k] - \delta[x_0, x_1, \ldots, x_{k-1}]}{x_k - x_0}. \]

It can be shown by mathematical induction that for equidistant points,
\[ \delta[x_0, x_1, \ldots, x_k] = \frac{\Delta^k y_0}{k!\,h^k} = \frac{\nabla^k y_k}{k!\,h^k} \tag{12.2.1} \]
where $y_0 = f(x_0)$, and $h = x_1 - x_0 = x_2 - x_1 = \cdots = x_k - x_{k-1}$.
In general,
\[ \delta[x_i, x_{i+1}, \ldots, x_{i+n}] = \frac{\Delta^n y_i}{n!\,h^n}, \]
where $y_i = f(x_i)$ and $h$ is the length of the interval for $i = 0, 1, 2, \ldots$.


Remark 12.2.6 In view of the remark (11.2.18) and (12.2.1), it is easily seen that for a polynomial 
function of degree n, the nth divided difference is constant and the (n+ 1)th divided difference is zero. 


Example 12.2.7 Show that $f(x)$ can be written as
\[ f(x) = f(x_0) + \delta[x_0, x_1](x - x_0) + \delta[x, x_0, x_1](x - x_0)(x - x_1). \]
Solution: By definition, we have
\[ \delta[x, x_0, x_1] = \frac{\delta[x_0, x_1] - \delta[x, x_0]}{x_1 - x}, \]
so, $\delta[x, x_0] = \delta[x_0, x_1] + (x - x_1)\,\delta[x, x_0, x_1]$. Now since
\[ \delta[x, x_0] = \frac{f(x) - f(x_0)}{x - x_0}, \]
we get the desired result.


Exercise 12.2.8 Show that $f(x)$ can be written in the following form:
\[ f(x) = P_2(x) + R_3(x), \]
where $P_2(x) = f(x_0) + \delta[x_0, x_1](x - x_0) + \delta[x_0, x_1, x_2](x - x_0)(x - x_1)$
and $R_3(x) = \delta[x, x_0, x_1, x_2](x - x_0)(x - x_1)(x - x_2)$.
Further show that $P_2(x_i) = f(x_i)$ for $i = 0, 1, 2$.


Remark 12.2.9 In general it can be shown that $f(x) = P_n(x) + R_{n+1}(x)$, where
\[ P_n(x) = f(x_0) + \delta[x_0, x_1](x - x_0) + \delta[x_0, x_1, x_2](x - x_0)(x - x_1) + \cdots + \delta[x_0, x_1, x_2, \ldots, x_n](x - x_0)(x - x_1)(x - x_2) \cdots (x - x_{n-1}), \]
and
\[ R_{n+1}(x) = (x - x_0)(x - x_1)(x - x_2) \cdots (x - x_n)\,\delta[x, x_0, x_1, x_2, \ldots, x_n]. \]

Here, $R_{n+1}(x)$ is called the remainder term.

It may be observed here that the expression $P_n(x)$ is a polynomial of degree $n$ and $P_n(x_i) = f(x_i)$ for $i = 0, 1, \ldots, n$, since the remainder term vanishes at every tabular point.

Further, if $f(x)$ is a polynomial of degree $n$, then in view of Remark 12.2.6, the remainder term $R_{n+1}(x) = 0$, as it is a multiple of the $(n+1)$th divided difference, which is 0.
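12.3 Lagrange’s Interpolation formula


In this section, we shall obtain an interpolating polynomial when the given data has unequal tabular points. However, before going to that, we see below an important result.


Theorem 12.3.1 The $k$th divided difference $\delta[x_0, x_1, \ldots, x_k]$ can be written as:
\[ \delta[x_0, x_1, \ldots, x_k] = \sum_{i=0}^{k} \frac{f(x_i)}{\prod_{j=0,\, j \neq i}^{k} (x_i - x_j)} = \frac{f(x_0)}{(x_0 - x_1)\cdots(x_0 - x_k)} + \frac{f(x_1)}{(x_1 - x_0)(x_1 - x_2)\cdots(x_1 - x_k)} + \cdots + \frac{f(x_k)}{(x_k - x_0)\cdots(x_k - x_{k-1})}. \]

Proof. We will prove the result by induction on $k$. The result is trivially true for $k = 0$. For $k = 1$,
\[ \delta[x_0, x_1] = \frac{f(x_1) - f(x_0)}{x_1 - x_0} = \frac{f(x_0)}{x_0 - x_1} + \frac{f(x_1)}{x_1 - x_0}. \]
Let us assume that the result is true for $k = n$, i.e.,
\[ \delta[x_0, x_1, \ldots, x_n] = \frac{f(x_0)}{(x_0 - x_1)(x_0 - x_2)\cdots(x_0 - x_n)} + \frac{f(x_1)}{(x_1 - x_0)(x_1 - x_2)\cdots(x_1 - x_n)} + \cdots + \frac{f(x_n)}{(x_n - x_0)(x_n - x_1)\cdots(x_n - x_{n-1})}. \]
Consider $k = n + 1$; then the $(n+1)$th divided difference is
\[ \delta[x_0, x_1, \ldots, x_{n+1}] = \frac{\delta[x_1, x_2, \ldots, x_{n+1}] - \delta[x_0, x_1, \ldots, x_n]}{x_{n+1} - x_0} \]
\[ = \frac{1}{x_{n+1} - x_0}\left[ \frac{f(x_1)}{(x_1 - x_2)\cdots(x_1 - x_{n+1})} + \frac{f(x_2)}{(x_2 - x_1)(x_2 - x_3)\cdots(x_2 - x_{n+1})} + \cdots + \frac{f(x_{n+1})}{(x_{n+1} - x_1)\cdots(x_{n+1} - x_n)} \right] \]
\[ \quad - \frac{1}{x_{n+1} - x_0}\left[ \frac{f(x_0)}{(x_0 - x_1)\cdots(x_0 - x_n)} + \frac{f(x_1)}{(x_1 - x_0)(x_1 - x_2)\cdots(x_1 - x_n)} + \cdots + \frac{f(x_n)}{(x_n - x_0)\cdots(x_n - x_{n-1})} \right], \]
which on rearranging the terms gives the desired result. Therefore, by mathematical induction, the proof of the theorem is complete.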


Remark 12.3.2 In view of Theorem 12.3.1, the $k$th divided difference of a function $f(x)$ remains unchanged regardless of how its arguments are interchanged, i.e., it is independent of the order of its arguments.


Now, if a function is approximated by a polynomial of degree $n$, then its $(n+1)$th divided difference relative to $x, x_0, x_1, \ldots, x_n$ will be zero (Remark 12.2.6), i.e.,
\[ \delta[x, x_0, x_1, \ldots, x_n] = 0. \]
Using this result, Theorem 12.3.1 gives
\[ \frac{f(x)}{(x - x_0)(x - x_1)\cdots(x - x_n)} + \frac{f(x_0)}{(x_0 - x)(x_0 - x_1)\cdots(x_0 - x_n)} + \frac{f(x_1)}{(x_1 - x)(x_1 - x_0)(x_1 - x_2)\cdots(x_1 - x_n)} + \cdots + \frac{f(x_n)}{(x_n - x)(x_n - x_0)\cdots(x_n - x_{n-1})} = 0, \]
or,
\[ \frac{f(x)}{(x - x_0)(x - x_1)\cdots(x - x_n)} = -\left[ \frac{f(x_0)}{(x_0 - x)(x_0 - x_1)\cdots(x_0 - x_n)} + \frac{f(x_1)}{(x_1 - x)(x_1 - x_0)(x_1 - x_2)\cdots(x_1 - x_n)} + \cdots + \frac{f(x_n)}{(x_n - x)(x_n - x_0)\cdots(x_n - x_{n-1})} \right]. \]
Thus,
\[ f(x) = \sum_{i=0}^{n} \left( \prod_{j=0,\, j \neq i}^{n} \frac{x - x_j}{x_i - x_j} \right) f(x_i) = \prod_{j=0}^{n} (x - x_j) \sum_{i=0}^{n} \frac{f(x_i)}{(x - x_i) \prod_{j=0,\, j \neq i}^{n} (x_i - x_j)}. \]

Note that the expression on the right is a polynomial of degree $n$ and takes the value $f(x_i)$ at $x = x_i$ for $i = 0, 1, \ldots, n$.
This polynomial approximation is called LAGRANGE’S INTERPOLATION FORMULA.


Remark 12.3.3 In view of Remark 12.2.9, we can observe that $P_n(x)$ is another form of Lagrange’s interpolating polynomial as obtained above. Also, the remainder term $R_{n+1}$ gives an estimate of the error between the true value and the interpolated value of the function.


Remark 12.3.4 We have seen earlier that the divided differences are independent of the order of their arguments. As Lagrange’s formula has been derived using the divided differences, it is not necessary here to have the tabular points in increasing order. Thus one can use Lagrange’s formula even when the points $x_0, x_1, \ldots, x_k, \ldots, x_n$ are in any order, which was not possible in the case of Newton’s difference formulae.


Remark 12.3.5 One can also use Lagrange’s interpolating formula to compute the value of $x$ for a given value of $y = f(x)$. This is done by interchanging the roles of $x$ and $y$, i.e., while using the table of values, we take the tabular points as $y_k$ and the nodal points as $x_k$.


Example 12.3.6 Using the following data, find by Lagrange’s formula, the value of $f(x)$ at $x = 10$:

i         0       1       2       3       4
x_i       9.3     9.6     10.2    10.4    10.8
f(x_i)    11.40   12.80   14.70   17.00   19.80

Also find the value of $x$ where $f(x) = 16.00$.
Solution: To compute $f(10)$, we first calculate the following products:
\[ \prod_{j=0}^{4} (x - x_j) = \prod_{j=0}^{4} (10 - x_j) = (10 - 9.3)(10 - 9.6)(10 - 10.2)(10 - 10.4)(10 - 10.8) = -0.01792, \]
\[ \prod_{j=1}^{4} (x_0 - x_j) = 0.4455, \quad \prod_{j=0,\, j \neq 1}^{4} (x_1 - x_j) = -0.1728, \quad \prod_{j=0,\, j \neq 2}^{4} (x_2 - x_j) = 0.0648, \]
\[ \prod_{j=0,\, j \neq 3}^{4} (x_3 - x_j) = -0.0704, \quad \text{and} \quad \prod_{j=0,\, j \neq 4}^{4} (x_4 - x_j) = 0.4320. \]
Thus,
\[ f(10) \approx -0.01792 \times \left[ \frac{11.40}{0.7 \times 0.4455} + \frac{12.80}{0.4 \times (-0.1728)} + \frac{14.70}{(-0.2) \times 0.0648} + \frac{17.00}{(-0.4) \times (-0.0704)} + \frac{19.80}{(-0.8) \times 0.4320} \right] = 13.197845. \]


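Now to find the value of $x$ such that $f(x) = 16$, we interchange the roles of $x$ and $y$ and calculate the following products:
\[ \prod_{j=0}^{4} (y - y_j) = \prod_{j=0}^{4} (16 - y_j) = (16 - 11.4)(16 - 12.8)(16 - 14.7)(16 - 17.0)(16 - 19.8) = 72.7168, \]
\[ \prod_{j=1}^{4} (y_0 - y_j) = 217.3248, \quad \prod_{j=0,\, j \neq 1}^{4} (y_1 - y_j) = -78.204, \quad \prod_{j=0,\, j \neq 2}^{4} (y_2 - y_j) = 73.5471, \]
\[ \prod_{j=0,\, j \neq 3}^{4} (y_3 - y_j) = -151.4688, \quad \text{and} \quad \prod_{j=0,\, j \neq 4}^{4} (y_4 - y_j) = 839.664. \]
Thus, the required value of $x$ is obtained as:
\[ x \approx 72.7168 \times \left[ \frac{9.3}{4.6 \times 217.3248} + \frac{9.6}{3.2 \times (-78.204)} + \frac{10.2}{1.3 \times 73.5471} + \frac{10.40}{(-1.0) \times (-151.4688)} + \frac{10.80}{(-3.8) \times 839.664} \right] \approx 10.39123. \]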
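
As a cross-check of both parts of this example, here is a minimal Python sketch of Lagrange's formula. It is not part of the original notes, and the function name `lagrange` is an assumption introduced here. Inverse interpolation is obtained simply by swapping the two lists, as described in Remark 12.3.5.

```python
def lagrange(xs, ys, x):
    """Evaluate Lagrange's interpolating polynomial through (xs[i], ys[i]) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        weight = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                weight *= (x - xj) / (xi - xj)   # factor (x - x_j)/(x_i - x_j)
        total += weight * yi
    return total


xs = [9.3, 9.6, 10.2, 10.4, 10.8]
ys = [11.40, 12.80, 14.70, 17.00, 19.80]
print(lagrange(xs, ys, 10.0))    # estimate of f(10)
print(lagrange(ys, xs, 16.0))    # inverse interpolation: x with f(x) = 16
```

The product form used here avoids dividing by $(x - x_i)$, so the function can also be evaluated at a tabular point.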


Exercise 12.3.7 The following table gives the data for steam pressure $P$ vs temperature $T$:

T    360      365      373      383      390
P    154.0    165.0    190.0    210.0    240.0

Compute the pressure at $T = 375$.


Exercise 12.3.8 Compute from the following table the value of $y$ for $x = 6.20$:

x    5.60    5.90    6.50    6.90    7.20
y    2.30    1.80    1.35    1.95    2.00

Also find the value of $x$ where $y = 1.00$.


12.4 Gauss’s and Stirling’s Formulas 


In case of equidistant tabular points a convenient form for the interpolating polynomial can be derived from Lagrange’s interpolating polynomial. The process involves renaming or re-designating the tabular points. We illustrate it by deriving the interpolating formula for 6 tabular points. This can be generalized to a larger number of points. Let the given tabular points be $x_0$, $x_1 = x_0 + h$, $x_2 = x_0 - h$, $x_3 = x_0 + 2h$, $x_4 = x_0 - 2h$, $x_5 = x_0 + 3h$. These six points in the given order are not equidistant. We re-designate them for the sake of convenience as follows: $x^*_{-2} = x_4$, $x^*_{-1} = x_2$, $x^*_0 = x_0$, $x^*_1 = x_1$, $x^*_2 = x_3$, $x^*_3 = x_5$. These re-designated tabular points in their given order are equidistant. Now recall from Remark 12.3.3 that Lagrange’s interpolating polynomial can also be written as:
\[ f(x) \approx f(x_0) + \delta[x_0, x_1](x - x_0) + \delta[x_0, x_1, x_2](x - x_0)(x - x_1) + \delta[x_0, x_1, x_2, x_3](x - x_0)(x - x_1)(x - x_2) \]
\[ \quad + \delta[x_0, x_1, x_2, x_3, x_4](x - x_0)(x - x_1)(x - x_2)(x - x_3) + \delta[x_0, x_1, x_2, x_3, x_4, x_5](x - x_0)(x - x_1)(x - x_2)(x - x_3)(x - x_4), \]
which on using the re-designated points gives:
\[ f(x) \approx f(x^*_0) + \delta[x^*_0, x^*_1](x - x^*_0) + \delta[x^*_0, x^*_1, x^*_{-1}](x - x^*_0)(x - x^*_1) + \delta[x^*_0, x^*_1, x^*_{-1}, x^*_2](x - x^*_0)(x - x^*_1)(x - x^*_{-1}) \]
\[ \quad + \delta[x^*_0, x^*_1, x^*_{-1}, x^*_2, x^*_{-2}](x - x^*_0)(x - x^*_1)(x - x^*_{-1})(x - x^*_2) + \delta[x^*_0, x^*_1, x^*_{-1}, x^*_2, x^*_{-2}, x^*_3](x - x^*_0)(x - x^*_1)(x - x^*_{-1})(x - x^*_2)(x - x^*_{-2}). \]


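Now note that the points $x^*_{-2}, x^*_{-1}, x^*_0, x^*_1, x^*_2$ and $x^*_3$ are equidistant and the divided differences are independent of the order of their arguments. Thus, we have
\[ \delta[x^*_0, x^*_1] = \frac{\Delta y^*_0}{h}, \qquad \delta[x^*_0, x^*_1, x^*_{-1}] = \delta[x^*_{-1}, x^*_0, x^*_1] = \frac{\Delta^2 y^*_{-1}}{2h^2}, \]
\[ \delta[x^*_0, x^*_1, x^*_{-1}, x^*_2] = \delta[x^*_{-1}, x^*_0, x^*_1, x^*_2] = \frac{\Delta^3 y^*_{-1}}{3!\,h^3}, \]
\[ \delta[x^*_0, x^*_1, x^*_{-1}, x^*_2, x^*_{-2}] = \delta[x^*_{-2}, x^*_{-1}, x^*_0, x^*_1, x^*_2] = \frac{\Delta^4 y^*_{-2}}{4!\,h^4}, \]
\[ \delta[x^*_0, x^*_1, x^*_{-1}, x^*_2, x^*_{-2}, x^*_3] = \delta[x^*_{-2}, x^*_{-1}, x^*_0, x^*_1, x^*_2, x^*_3] = \frac{\Delta^5 y^*_{-2}}{5!\,h^5}, \]
where $y^*_i = f(x^*_i)$ for $i = -2, -1, 0, 1, 2$. Now using the above relations and the transformation $x = x^*_0 + hu$, we get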


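\[ f(x^*_0 + hu) \approx y^*_0 + \frac{\Delta y^*_0}{h}(hu) + \frac{\Delta^2 y^*_{-1}}{2h^2}(hu)(hu - h) + \frac{\Delta^3 y^*_{-1}}{3!\,h^3}(hu)(hu - h)(hu + h) \]
\[ \quad + \frac{\Delta^4 y^*_{-2}}{4!\,h^4}(hu)(hu - h)(hu + h)(hu - 2h) + \frac{\Delta^5 y^*_{-2}}{5!\,h^5}(hu)(hu - h)(hu + h)(hu - 2h)(hu + 2h). \]

Thus we get the following form of interpolating polynomial
\[ f(x^*_0 + hu) \approx y^*_0 + u\,\Delta y^*_0 + u(u - 1)\frac{\Delta^2 y^*_{-1}}{2!} + u(u^2 - 1)\frac{\Delta^3 y^*_{-1}}{3!} + u(u^2 - 1)(u - 2)\frac{\Delta^4 y^*_{-2}}{4!} + u(u^2 - 1)(u^2 - 2^2)\frac{\Delta^5 y^*_{-2}}{5!}. \tag{12.4.1} \]

Similarly, using the tabular points $x_0$, $x_1 = x_0 - h$, $x_2 = x_0 + h$, $x_3 = x_0 - 2h$, $x_4 = x_0 + 2h$, $x_5 = x_0 - 3h$, and re-designating them as $x^*_{-3}, x^*_{-2}, x^*_{-1}, x^*_0, x^*_1$ and $x^*_2$, we get another form of interpolating polynomial as follows:
\[ f(x^*_0 + hu) \approx y^*_0 + u\,\Delta y^*_{-1} + u(u + 1)\frac{\Delta^2 y^*_{-1}}{2!} + u(u^2 - 1)\frac{\Delta^3 y^*_{-2}}{3!} + u(u^2 - 1)(u + 2)\frac{\Delta^4 y^*_{-2}}{4!} + u(u^2 - 1)(u^2 - 2^2)\frac{\Delta^5 y^*_{-3}}{5!}. \tag{12.4.2} \]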
Now taking the average of the two interpolating polynomials (12.4.1) and (12.4.2) (called GAUSS’S FIRST AND SECOND INTERPOLATING FORMULAS, respectively), we obtain Stirling’s Formula of interpolation:
\[ f(x^*_0 + hu) \approx y^*_0 + u\,\frac{\Delta y^*_{-1} + \Delta y^*_0}{2} + u^2\,\frac{\Delta^2 y^*_{-1}}{2!} + \frac{u(u^2 - 1)}{2}\left[\frac{\Delta^3 y^*_{-2} + \Delta^3 y^*_{-1}}{3!}\right] + u^2(u^2 - 1)\frac{\Delta^4 y^*_{-2}}{4!} + \frac{u(u^2 - 1)(u^2 - 2^2)}{2}\left[\frac{\Delta^5 y^*_{-3} + \Delta^5 y^*_{-2}}{5!}\right] + \cdots \tag{12.4.3} \]

These are very helpful when the point of interpolation lies near the middle of the interpolating interval. In this case one usually writes the diagonal form of the difference table.


Example 12.4.1 Using the following data, find by Stirling’s formula, the value of $f(x) = \cot(\pi x)$ at $x = 0.225$:

x       0.20      0.21      0.22      0.23      0.24
f(x)    1.37638   1.28919   1.20879   1.13427   1.06489

Here the point $x = 0.225$ lies near the central tabular point $x = 0.22$. Thus, we define $x_{-2} = 0.20$, $x_{-1} = 0.21$, $x_0 = 0.22$, $x_1 = 0.23$, $x_2 = 0.24$, to get the difference table in diagonal form as:

x_{-2} = 0.20   y_{-2} = 1.37638
                                    Δy_{-2} = -.08719
x_{-1} = 0.21   y_{-1} = 1.28919                        Δ²y_{-2} = .00679
                                    Δy_{-1} = -.08040                        Δ³y_{-2} = -.00091
x_0 = 0.22      y_0 = 1.20879                           Δ²y_{-1} = .00588                        Δ⁴y_{-2} = .00017
                                    Δy_0 = -.07452                           Δ³y_{-1} = -.00074
x_1 = 0.23      y_1 = 1.13427                           Δ²y_0 = .00514
                                    Δy_1 = -.06938
x_2 = 0.24      y_2 = 1.06489

(here, $\Delta y_0 = y_1 - y_0 = 1.13427 - 1.20879 = -.07452$; $\Delta y_{-1} = 1.20879 - 1.28919 = -0.08040$; and $\Delta^2 y_{-1} = \Delta y_0 - \Delta y_{-1} = .00588$, etc.).
Using Stirling’s formula with $u = \frac{0.225 - 0.22}{0.01} = 0.5$, we get $f(0.225)$ as follows:
\[ f(0.225) = 1.20879 + 0.5 \times \frac{-.08040 - .07452}{2} + (0.5)^2 \times \frac{.00588}{2} + \frac{0.5(0.5^2 - 1)}{2} \times \frac{(-.00091 - .00074)}{3!} + 0.5^2(0.5^2 - 1) \times \frac{.00017}{4!} = 1.1708452. \]

Note that the tabulated value of $\cot(\pi x)$ at $x = 0.225$ is 1.1708496.
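
A small Python sketch of Stirling's formula (12.4.3), truncated after the fourth differences, may help in checking such calculations. It is illustrative only: the names `differences` and `stirling` are assumptions introduced here, and the function expects exactly five equidistant values centred on $x_0$.

```python
import math


def differences(row):
    """Forward differences of a list of values."""
    return [b - a for a, b in zip(row, row[1:])]


def stirling(ys, u):
    """Stirling's formula through fourth differences.

    ys = [y_-2, y_-1, y_0, y_1, y_2] at equidistant points; u = (x - x_0)/h.
    """
    d1 = differences(ys)    # Δy_-2, Δy_-1, Δy_0, Δy_1
    d2 = differences(d1)    # Δ²y_-2, Δ²y_-1, Δ²y_0
    d3 = differences(d2)    # Δ³y_-2, Δ³y_-1
    d4 = differences(d3)    # Δ⁴y_-2
    return (
        ys[2]
        + u * (d1[1] + d1[2]) / 2
        + u**2 / 2 * d2[1]
        + u * (u**2 - 1) / 6 * (d3[0] + d3[1]) / 2
        + u**2 * (u**2 - 1) / 24 * d4[0]
    )


ys = [1.37638, 1.28919, 1.20879, 1.13427, 1.06489]
approx = stirling(ys, 0.5)
exact = 1 / math.tan(math.pi * 0.225)
print(approx, exact)
```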


Exercise 12.4.2 Compute from the following table the value of $y$ for $x = 0.05$:

x    0.00      0.02      0.04      0.06      0.08
y    0.00000   0.02256   0.04511   0.06762   0.09007


Chapter 13 


Numerical Differentiation and Integration


13.1 Introduction 


Numerical differentiation/integration is the process of computing the value of the derivative/integral of a function whose analytical expression is not available, but which is specified through a set of values at certain tabular points $x_0, x_1, \ldots, x_n$. In such cases, we first determine an interpolating polynomial approximating the function (either on the whole interval or in sub-intervals) and then differentiate/integrate this polynomial to approximately compute the value of the derivative/integral.


13.2. Numerical Differentiation 


In the case of differentiation, we first write the interpolating formula on the interval $(x_0, x_n)$ and then differentiate the polynomial term by term to get an approximating polynomial for the derivative of the function. When the tabular points are equidistant, one uses either Newton’s Forward/Backward Formula or Stirling’s Formula; otherwise Lagrange’s formula is used. Newton’s Forward/Backward formula is used depending upon the location of the point at which the derivative is to be computed. In case the given point is near the mid point of the interval, Stirling’s formula can be used. We illustrate the process by taking (i) Newton’s Forward formula, and (ii) Stirling’s formula.


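Recall that Newton’s forward interpolating polynomial is given by
\[ f(x) = f(x_0 + hu) \approx y_0 + \Delta y_0\, u + \frac{\Delta^2 y_0}{2!}\, u(u-1) + \cdots + \frac{\Delta^k y_0}{k!}\, u(u-1)\cdots(u-k+1) + \cdots + \frac{\Delta^n y_0}{n!}\, u(u-1)\cdots(u-n+1). \tag{13.2.1} \]

Differentiating (13.2.1), we get the approximate value of the first derivative at $x$ as
\[ \frac{df}{dx} = \frac{1}{h}\frac{df}{du} \approx \frac{1}{h}\left[ \Delta y_0 + \frac{\Delta^2 y_0}{2!}(2u - 1) + \frac{\Delta^3 y_0}{3!}(3u^2 - 6u + 2) + \cdots + \frac{\Delta^n y_0}{n!}\left( n u^{n-1} - \frac{n(n-1)^2}{2} u^{n-2} + \cdots + (-1)^{(n-1)}(n-1)! \right) \right], \tag{13.2.2} \]
where $u = \frac{x - x_0}{h}$.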
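Thus, an approximation to the value of the first derivative at $x = x_0$, i.e., $u = 0$, is obtained as:
\[ \left.\frac{df}{dx}\right|_{x = x_0} \approx \frac{1}{h}\left[ \Delta y_0 - \frac{\Delta^2 y_0}{2} + \frac{\Delta^3 y_0}{3} - \cdots + (-1)^{(n-1)} \frac{\Delta^n y_0}{n} \right]. \tag{13.2.3} \]

Similarly, using Stirling’s formula:
\[ f(x^*_0 + hu) \approx y^*_0 + u\,\frac{\Delta y^*_{-1} + \Delta y^*_0}{2} + u^2\,\frac{\Delta^2 y^*_{-1}}{2!} + \frac{u(u^2 - 1)}{2}\,\frac{\Delta^3 y^*_{-2} + \Delta^3 y^*_{-1}}{3!} + u^2(u^2 - 1)\frac{\Delta^4 y^*_{-2}}{4!} + \frac{u(u^2 - 1)(u^2 - 2^2)}{2}\,\frac{\Delta^5 y^*_{-3} + \Delta^5 y^*_{-2}}{5!} + \cdots \tag{13.2.4} \]
Therefore,
\[ \frac{df}{dx} = \frac{1}{h}\frac{df}{du} \approx \frac{1}{h}\left[ \frac{\Delta y^*_{-1} + \Delta y^*_0}{2} + u\,\Delta^2 y^*_{-1} + \frac{(3u^2 - 1)}{2}\,\frac{(\Delta^3 y^*_{-2} + \Delta^3 y^*_{-1})}{3!} + 2u(2u^2 - 1)\frac{\Delta^4 y^*_{-2}}{4!} + \frac{(5u^4 - 15u^2 + 4)}{2}\,\frac{(\Delta^5 y^*_{-3} + \Delta^5 y^*_{-2})}{5!} + \cdots \right]. \tag{13.2.5} \]

Thus, the derivative at $x = x^*_0$ is obtained as:
\[ \left.\frac{df}{dx}\right|_{x = x^*_0} \approx \frac{1}{h}\left[ \frac{\Delta y^*_{-1} + \Delta y^*_0}{2} - \frac{(1)}{2}\,\frac{(\Delta^3 y^*_{-2} + \Delta^3 y^*_{-1})}{3!} + \frac{4 \times (\Delta^5 y^*_{-3} + \Delta^5 y^*_{-2})}{2 \times 5!} + \cdots \right]. \tag{13.2.6} \]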


Remark 13.2.1 Numerical differentiation using Stirling’s formula is found to be more accurate than that with Newton’s difference formulae. Also it is more convenient to use.


Now higher derivatives can be found by successively differentiating the interpolating polynomials. Thus, e.g., using (13.2.2), we get the second derivative at $x = x_0$ as
\[ \left.\frac{d^2 f}{dx^2}\right|_{x = x_0} \approx \frac{1}{h^2}\left[ \Delta^2 y_0 - \Delta^3 y_0 + \frac{2 \times 11 \times \Delta^4 y_0}{4!} - \cdots \right]. \]
Example 13.2.2 Compute from the following table the value of the derivative of $y = f(x)$ at $x = 1.7489$:

x    1.73           1.74           1.75           1.76           1.77
y    0.1772844100   0.1755204006   0.1737739435   0.1720448638   0.1703329888

Solution: We note here that $x_0 = 1.75$, $h = 0.01$, so $u = (1.7489 - 1.75)/0.01 = -0.11$, and
$\Delta y_0 = -.0017290797$, $\Delta^2 y_0 = .0000172047$, $\Delta^3 y_0 = -.0000001712$,
$\Delta y_{-1} = -.0017464571$, $\Delta^2 y_{-1} = .0000173774$, $\Delta^3 y_{-1} = -.0000001727$,
$\Delta^3 y_{-2} = -.0000001749$, $\Delta^4 y_{-2} = -.0000000022$.

Thus, $f'(1.7489)$ is obtained as:

(i) Using Newton’s Forward difference formula,
\[ f'(1.7489) \approx \frac{1}{0.01}\left[ -0.0017290797 + (2 \times (-0.11) - 1) \times \frac{0.0000172047}{2} + (3 \times (-0.11)^2 - 6 \times (-0.11) + 2) \times \frac{-0.0000001712}{3!} \right] = -0.173965150143. \]

(ii) Using Stirling’s formula, we get:
\[ f'(1.7489) \approx \frac{1}{.01}\left[ \frac{(-.0017464571) + (-.0017290797)}{2} + (-0.11) \times .0000173774 + \frac{(3 \times (-0.11)^2 - 1)}{2}\,\frac{((-.0000001749) + (-.0000001727))}{3!} + 2 \times (-0.11) \times (2(-0.11)^2 - 1) \times \frac{(-.0000000022)}{4!} \right] = -0.17396520185. \]
It may be pointed out here that the above table is for $f(x) = e^{-x}$, whose derivative has the value $-0.1739652000$ at $x = 1.7489$.
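
The Stirling-based derivative (13.2.5), truncated after the fourth differences, can be sketched in Python as follows. This is an illustration and not part of the original notes: the function name `stirling_derivative` is an assumption introduced here, and the tabulated values are read as $0.1772844100, \ldots, 0.1703329888$, which is consistent with the listed differences and with $f(x) = e^{-x}$.

```python
import math


def stirling_derivative(ys, h, u):
    """Approximate f'(x_0 + h*u) from ys = [y_-2, y_-1, y_0, y_1, y_2]."""
    d1 = [b - a for a, b in zip(ys, ys[1:])]    # Δy_-2 .. Δy_1
    d2 = [b - a for a, b in zip(d1, d1[1:])]    # Δ²y_-2 .. Δ²y_0
    d3 = [b - a for a, b in zip(d2, d2[1:])]    # Δ³y_-2, Δ³y_-1
    d4 = d3[1] - d3[0]                          # Δ⁴y_-2
    bracket = (
        (d1[1] + d1[2]) / 2
        + u * d2[1]
        + (3 * u**2 - 1) / 2 * (d3[0] + d3[1]) / 6
        + 2 * u * (2 * u**2 - 1) * d4 / 24
    )
    return bracket / h


ys = [0.1772844100, 0.1755204006, 0.1737739435, 0.1720448638, 0.1703329888]
estimate = stirling_derivative(ys, 0.01, -0.11)
print(estimate, -math.exp(-1.7489))
```

Setting `u = 0` reduces the bracket to the first two retained terms of (13.2.6), which gives the derivative at the central tabular point.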


Example 13.2.3 Using only the first term in the formula (13.2.6), show that
\[ f'(x^*_0) \approx \frac{y^*_1 - y^*_{-1}}{2h}. \]
Hence compute from the following table the value of the derivative of $y = e^x$ at $x = 1.15$:

x      1.05     1.15     1.25
e^x    2.8577   3.1582   3.4903

Solution: Truncating the formula (13.2.6) after the first term, we get:
\[ f'(x^*_0) \approx \frac{1}{h}\left[ \frac{\Delta y^*_{-1} + \Delta y^*_0}{2} \right] = \frac{(y^*_0 - y^*_{-1}) + (y^*_1 - y^*_0)}{2h} = \frac{y^*_1 - y^*_{-1}}{2h}. \]
Now from the given table, taking $x^*_0 = 1.15$, we have
\[ f'(1.15) \approx \frac{3.4903 - 2.8577}{2 \times 0.1} = 3.1630. \]
Note that the error between the computed value and the true value is $3.1630 - 3.1582 = 0.0048$.


Exercise 13.2.4 Retaining only the first two terms in the formula (13.2.3), show that
\[ f'(x_0) \approx \frac{-3y_0 + 4y_1 - y_2}{2h}. \]
Hence compute the derivative of $y = e^x$ at $x = 1.15$ from the following table:

x      1.15     1.20     1.25
e^x    3.1582   3.3201   3.4903

Also compare your result with the computed value in Example 13.2.3.


Exercise 13.2.5 Retaining only the first two terms in the formula (13.2.6), show that
\[ f'(x^*_0) \approx \frac{y^*_{-2} - 8y^*_{-1} + 8y^*_1 - y^*_2}{12h}. \]
Hence compute from the following table the value of the derivative of $y = e^x$ at $x = 1.15$:

x      1.05     1.10     1.15     1.20     1.25
e^x    2.8577   3.0042   3.1582   3.3201   3.4903

Exercise 13.2.6 The following table gives the values of $y = f(x)$ at the tabular points $x = 0 + 0.05 \times k$, $k = 0, 1, 2, 3, 4, 5$.

x    0.00      0.05      0.10      0.15      0.20      0.25
y    0.00000   0.10017   0.20134   0.30452   0.41075   0.52110

Compute (i) the derivatives $y'$ and $y''$ at $x = 0.0$ by using the formula (13.2.2), (ii) the second derivative $y''$ at $x = 0.1$ by using the formula (13.2.6).
Similarly, if we have tabular points which are not equidistant, one can use Lagrange’s interpolating polynomial, which is differentiated to get an estimate of the first derivative. We shall see the result for four tabular points and then give the general formula. Let $x_0, x_1, x_2, x_3$ be the tabular points; then the corresponding Lagrange’s formula gives us:
\[ f(x) \approx \frac{(x - x_1)(x - x_2)(x - x_3)}{(x_0 - x_1)(x_0 - x_2)(x_0 - x_3)} f(x_0) + \frac{(x - x_0)(x - x_2)(x - x_3)}{(x_1 - x_0)(x_1 - x_2)(x_1 - x_3)} f(x_1) + \frac{(x - x_0)(x - x_1)(x - x_3)}{(x_2 - x_0)(x_2 - x_1)(x_2 - x_3)} f(x_2) + \frac{(x - x_0)(x - x_1)(x - x_2)}{(x_3 - x_0)(x_3 - x_1)(x_3 - x_2)} f(x_3). \]

Differentiation of the above interpolating polynomial gives:
\[ \frac{df(x)}{dx} \approx \frac{(x - x_2)(x - x_3) + (x - x_1)(x - x_3) + (x - x_1)(x - x_2)}{(x_0 - x_1)(x_0 - x_2)(x_0 - x_3)} f(x_0) + \frac{(x - x_2)(x - x_3) + (x - x_0)(x - x_3) + (x - x_0)(x - x_2)}{(x_1 - x_0)(x_1 - x_2)(x_1 - x_3)} f(x_1) \]
\[ \quad + \frac{(x - x_1)(x - x_3) + (x - x_0)(x - x_3) + (x - x_0)(x - x_1)}{(x_2 - x_0)(x_2 - x_1)(x_2 - x_3)} f(x_2) + \frac{(x - x_1)(x - x_2) + (x - x_0)(x - x_2) + (x - x_0)(x - x_1)}{(x_3 - x_0)(x_3 - x_1)(x_3 - x_2)} f(x_3) \]
\[ = \left( \prod_{r=0}^{3} (x - x_r) \right) \sum_{i=0}^{3} \frac{f(x_i)}{(x - x_i) \prod_{j=0,\, j \neq i}^{3} (x_i - x_j)} \left( \sum_{k=0,\, k \neq i}^{3} \frac{1}{(x - x_k)} \right). \tag{13.2.7} \]

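In particular, the value of the derivative at $x = x_0$ is given by
\[ \left.\frac{df}{dx}\right|_{x = x_0} \approx \left[ \frac{1}{(x_0 - x_1)} + \frac{1}{(x_0 - x_2)} + \frac{1}{(x_0 - x_3)} \right] f(x_0) + \frac{(x_0 - x_2)(x_0 - x_3)}{(x_1 - x_0)(x_1 - x_2)(x_1 - x_3)} f(x_1) + \frac{(x_0 - x_1)(x_0 - x_3)}{(x_2 - x_0)(x_2 - x_1)(x_2 - x_3)} f(x_2) + \frac{(x_0 - x_1)(x_0 - x_2)}{(x_3 - x_0)(x_3 - x_1)(x_3 - x_2)} f(x_3). \]

Now, generalizing Equation (13.2.7) for $n + 1$ tabular points $x_0, x_1, \ldots, x_n$ we get:
\[ \frac{df}{dx} = \prod_{r=0}^{n} (x - x_r) \sum_{i=0}^{n} \frac{f(x_i)}{(x - x_i) \prod_{j=0,\, j \neq i}^{n} (x_i - x_j)} \left( \sum_{k=0,\, k \neq i}^{n} \frac{1}{(x - x_k)} \right). \]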

Example 13.2.7 Compute from the following table the value of the derivative of $y = f(x)$ at $x = 0.6$:

x    0.4         0.6         0.7
y    3.3836494   4.2442376   4.7275054

Solution: The given tabular points are not equidistant, so we use Lagrange’s interpolating polynomial with three points: $x_0 = 0.4$, $x_1 = 0.6$, $x_2 = 0.7$. Now differentiating this polynomial, the derivative of the function at $x = x_1$ is obtained in the following form:
\[ \left.\frac{df}{dx}\right|_{x = x_1} \approx \frac{(x_1 - x_2)}{(x_0 - x_1)(x_0 - x_2)} f(x_0) + \left[ \frac{1}{(x_1 - x_2)} + \frac{1}{(x_1 - x_0)} \right] f(x_1) + \frac{(x_1 - x_0)}{(x_2 - x_0)(x_2 - x_1)} f(x_2). \]
Note: The reader is advised to derive the above expression.
Now, using the values from the table, we get:
\[ \left.\frac{df}{dx}\right|_{x = 0.6} \approx \frac{(0.6 - 0.7)}{(0.4 - 0.6)(0.4 - 0.7)} \times 3.3836494 + \left[ \frac{1}{(0.6 - 0.7)} + \frac{1}{(0.6 - 0.4)} \right] \times 4.2442376 + \frac{(0.6 - 0.4)}{(0.7 - 0.4)(0.7 - 0.6)} \times 4.7275054 \]
\[ = -5.63941567 - 21.221188 + 31.51670267 = 4.6560990. \]
For the sake of comparison, it may be pointed out here that the above table is for the function $f(x) = 2e^x + x$, and the value of its derivative at $x = 0.6$ is 4.6442376.
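
The differentiated Lagrange polynomial can also be evaluated numerically. The sketch below is not part of the original notes; the function name `lagrange_derivative` is an assumption introduced here. It sums, for each $i$, the derivative of $\prod_{j \neq i} (x - x_j)$ divided by $\prod_{j \neq i} (x_i - x_j)$, so it remains valid when $x$ coincides with a tabular point. The data are the tabulated values, with $y(0.7) = 4.7275054$.

```python
import math


def lagrange_derivative(xs, ys, x):
    """Derivative at x of Lagrange's interpolating polynomial through (xs, ys)."""
    n = len(xs)
    total = 0.0
    for i in range(n):
        denom = 1.0
        for j in range(n):
            if j != i:
                denom *= xs[i] - xs[j]
        # d/dx of prod_{j != i} (x - x_j): sum of products with one factor removed
        numer = 0.0
        for k in range(n):
            if k == i:
                continue
            prod = 1.0
            for j in range(n):
                if j != i and j != k:
                    prod *= x - xs[j]
            numer += prod
        total += ys[i] * numer / denom
    return total


xs = [0.4, 0.6, 0.7]
ys = [3.3836494, 4.2442376, 4.7275054]
estimate = lagrange_derivative(xs, ys, 0.6)
print(estimate, 2 * math.exp(0.6) + 1)   # estimate vs derivative of 2e^x + x
```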


Exercise 13.2.8 For the function whose tabular values are given in Example 13.2.7, compute the value of its derivative at $x = 0.5$.


Remark 13.2.9 It may be remarked here that the numerical differentiation for higher derivatives does 


not give very accurate results and so is not much preferred. 


13.3. Numerical Integration 


Numerical Integration is the process of computing the value of a definite integral, $\int_a^b f(x)\,dx$, when the values of the integrand function, $y = f(x)$, are given at some tabular points. As in the case of numerical differentiation, here also the integrand is first replaced with an interpolating polynomial, and then the interpolating polynomial is integrated to compute the value of the definite integral. This gives us a "quadrature formula" for numerical integration. In the case of equidistant tabular points, either Newton’s formulae or Stirling’s formula are used. Otherwise, one uses Lagrange’s formula for the interpolating polynomial. We shall consider below the case of equidistant points: $x_0, x_1, \ldots, x_n$.


13.3.1 A General Quadrature Formula 


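Let $f(x_k) = y_k$ be the nodal value at the tabular point $x_k$ for $k = 0, 1, \ldots, n$, where $x_0 = a$ and $x_n = x_0 + nh = b$. Now, a general quadrature formula is obtained by replacing the integrand by Newton’s forward difference interpolating polynomial. Thus, we get,
\[ \int_a^b f(x)\,dx = \int_{x_0}^{x_n} \left[ y_0 + \Delta y_0\, u + \frac{\Delta^2 y_0}{2!}\, u(u-1) + \frac{\Delta^3 y_0}{3!}\, u(u-1)(u-2) + \frac{\Delta^4 y_0}{4!}\, u(u-1)(u-2)(u-3) + \cdots \right] dx, \quad u = \frac{x - x_0}{h}. \]
This on using the transformation $x = x_0 + hu$ gives:
\[ \int_a^b f(x)\,dx = h \int_0^n \left[ y_0 + u\,\Delta y_0 + \frac{\Delta^2 y_0}{2!}\, u(u-1) + \frac{\Delta^3 y_0}{3!}\, u(u-1)(u-2) + \frac{\Delta^4 y_0}{4!}\, u(u-1)(u-2)(u-3) + \cdots \right] du \]
which on term by term integration gives,
\[ \int_a^b f(x)\,dx = h\left[ n y_0 + \frac{n^2}{2}\Delta y_0 + \frac{\Delta^2 y_0}{2!}\left( \frac{n^3}{3} - \frac{n^2}{2} \right) + \frac{\Delta^3 y_0}{3!}\left( \frac{n^4}{4} - n^3 + n^2 \right) + \frac{\Delta^4 y_0}{4!}\left( \frac{n^5}{5} - \frac{3n^4}{2} + \frac{11n^3}{3} - 3n^2 \right) + \cdots \right]. \tag{13.3.1} \]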


For $n = 1$, i.e., when a linear interpolating polynomial is used, we have
\[ \int_a^b f(x)\,dx = h\left[ y_0 + \frac{\Delta y_0}{2} \right] = \frac{h}{2}\left[ y_0 + y_1 \right]. \tag{13.3.2} \]
Similarly, using an interpolating polynomial of degree 2 (i.e., $n = 2$), we obtain,
\[ \int_a^b f(x)\,dx = h\left[ 2y_0 + 2\Delta y_0 + \left( \frac{8}{3} - \frac{4}{2} \right)\frac{\Delta^2 y_0}{2} \right] = 2h\left[ y_0 + (y_1 - y_0) + \frac{1}{3} \times \frac{y_2 - 2y_1 + y_0}{2} \right] = \frac{h}{3}\left[ y_0 + 4y_1 + y_2 \right]. \tag{13.3.3} \]

In the above we have replaced the integrand by an interpolating polynomial over the whole interval $[a, b]$ and then integrated it term by term. However, this process is not very useful. More useful numerical integration formulae are obtained by dividing the interval $[a, b]$ into $n$ sub-intervals $[x_k, x_{k+1}]$, where $x_k = x_0 + kh$ for $k = 0, 1, \ldots, n$ with $x_0 = a$, $x_n = x_0 + nh = b$.


13.3.2 Trapezoidal Rule 


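Here, the integral is computed on each of the sub-intervals by using the linear interpolating formula, i.e., for $n = 1$, and then summing them up to obtain the desired integral.

Note that
\[ \int_a^b f(x)\,dx = \int_{x_0}^{x_1} f(x)\,dx + \int_{x_1}^{x_2} f(x)\,dx + \cdots + \int_{x_k}^{x_{k+1}} f(x)\,dx + \cdots + \int_{x_{n-1}}^{x_n} f(x)\,dx. \]
Now using the formula (13.3.2) for $n = 1$ on the interval $[x_k, x_{k+1}]$, we get,
\[ \int_{x_k}^{x_{k+1}} f(x)\,dx = \frac{h}{2}\left[ y_k + y_{k+1} \right]. \]
Thus, we have,
\[ \int_a^b f(x)\,dx = \frac{h}{2}[y_0 + y_1] + \frac{h}{2}[y_1 + y_2] + \cdots + \frac{h}{2}[y_k + y_{k+1}] + \cdots + \frac{h}{2}[y_{n-2} + y_{n-1}] + \frac{h}{2}[y_{n-1} + y_n], \]
i.e.,
\[ \int_a^b f(x)\,dx = \frac{h}{2}\left[ y_0 + 2y_1 + 2y_2 + \cdots + 2y_k + \cdots + 2y_{n-1} + y_n \right] = h\left[ \frac{y_0 + y_n}{2} + \sum_{i=1}^{n-1} y_i \right]. \tag{13.3.4} \]

This is called the TRAPEZOIDAL RULE. It is a simple quadrature formula, but is not very accurate.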


Remark 13.3.1 An estimate for the error $E_1$ in numerical integration using the Trapezoidal rule is given by
\[ E_1 = -\frac{b - a}{12}\,\overline{\Delta^2 y}, \]
where $\overline{\Delta^2 y}$ is the average value of the second forward differences.
Recall that in the case of a linear function, the second forward differences are zero; hence, the Trapezoidal rule gives the exact value of the integral if the integrand is a linear function.
Example 13.3.2 Using the Trapezoidal rule compute the integral $\int_0^1 e^{x^2}\,dx$, where the table for the values of $y = e^{x^2}$ is given below:

x          0.0       0.1       0.2       0.3       0.4       0.5       0.6       0.7       0.8       0.9       1.0
e^{x^2}    1.00000   1.01005   1.04081   1.09417   1.17351   1.28402   1.43332   1.63231   1.89648   2.24790   2.71828

Solution: Here, $h = 0.1$, $n = 10$,
\[ \frac{y_0 + y_{10}}{2} = \frac{1.0 + 2.71828}{2} = 1.85914, \]
and
\[ \sum_{i=1}^{9} y_i = 12.81257. \]
Thus,
\[ \int_0^1 e^{x^2}\,dx = 0.1 \times [1.85914 + 12.81257] = 1.467171. \]
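
The Trapezoidal rule (13.3.4) is short enough to state directly in code. The following Python sketch is illustrative and not part of the original notes; the function name `trapezoidal` is an assumption introduced here, and the nodal values are assumed to be given at equidistant points with spacing `h`.

```python
def trapezoidal(ys, h):
    """Composite Trapezoidal rule for equidistant nodal values ys with spacing h."""
    if len(ys) < 2:
        raise ValueError("need at least two nodal values")
    return h * ((ys[0] + ys[-1]) / 2 + sum(ys[1:-1]))


# Tabulated values of exp(x**2) at x = 0.0, 0.1, ..., 1.0
ys = [1.00000, 1.01005, 1.04081, 1.09417, 1.17351, 1.28402,
      1.43332, 1.63231, 1.89648, 2.24790, 2.71828]
print(trapezoidal(ys, 0.1))
```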


13.3.3. Simpson’s Rule 


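If we are given an odd number of tabular points, i.e., $n$ is even, then we can divide the given interval of integration into an even number of sub-intervals $[x_{2k}, x_{2k+2}]$. Note that for each of these sub-intervals, we have the three tabular points $x_{2k}, x_{2k+1}, x_{2k+2}$ and so the integrand is replaced with a quadratic interpolating polynomial. Thus using the formula (13.3.3), we get,
\[ \int_{x_{2k}}^{x_{2k+2}} f(x)\,dx = \frac{h}{3}\left[ y_{2k} + 4y_{2k+1} + y_{2k+2} \right]. \]
In view of this, we have
\[ \int_a^b f(x)\,dx = \int_{x_0}^{x_2} f(x)\,dx + \int_{x_2}^{x_4} f(x)\,dx + \cdots + \int_{x_{2k}}^{x_{2k+2}} f(x)\,dx + \cdots + \int_{x_{n-2}}^{x_n} f(x)\,dx \]
\[ = \frac{h}{3}\left[ (y_0 + 4y_1 + y_2) + (y_2 + 4y_3 + y_4) + \cdots + (y_{n-2} + 4y_{n-1} + y_n) \right] \]
\[ = \frac{h}{3}\left[ y_0 + 4y_1 + 2y_2 + 4y_3 + 2y_4 + \cdots + 2y_{n-2} + 4y_{n-1} + y_n \right], \]
which gives the second quadrature formula as follows:
\[ \int_a^b f(x)\,dx = \frac{h}{3}\left[ (y_0 + y_n) + 4 \times (y_1 + y_3 + \cdots + y_{2k+1} + \cdots + y_{n-1}) + 2 \times (y_2 + y_4 + \cdots + y_{2k} + \cdots + y_{n-2}) \right] \]
\[ = \frac{h}{3}\left[ (y_0 + y_n) + 4 \times \left( \sum_{i=1,\ i\text{-odd}}^{n-1} y_i \right) + 2 \times \left( \sum_{i=2,\ i\text{-even}}^{n-2} y_i \right) \right]. \tag{13.3.5} \]

This is known as SIMPSON’S RULE.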


Remark 13.3.3 An estimate for the error $E_2$ in numerical integration using Simpson’s rule is given by
\[ E_2 = -\frac{b - a}{180}\,\overline{\Delta^4 y}, \tag{13.3.6} \]
where $\overline{\Delta^4 y}$ is the average value of the fourth forward differences.
Example 13.3.4 Using the table for the values of $y = e^{x^2}$ as given in Example 13.3.2, compute the integral $\int_0^1 e^{x^2}\,dx$ by Simpson’s rule. Also estimate the error in its calculation and compare it with the error using the Trapezoidal rule.

Solution: Here, $h = 0.1$, $n = 10$, thus we have an odd number of nodal points. Further,
\[ y_0 + y_{10} = 1.0 + 2.71828 = 3.71828, \qquad \sum_{i=1,\ i\text{-odd}}^{9} y_i = y_1 + y_3 + y_5 + y_7 + y_9 = 7.26845, \]
and
\[ \sum_{i=2,\ i\text{-even}}^{8} y_i = y_2 + y_4 + y_6 + y_8 = 5.54412. \]
Thus,
\[ \int_0^1 e^{x^2}\,dx = \frac{0.1}{3} \times [3.71828 + 4 \times 7.26845 + 2 \times 5.54412] = 1.46267733. \]
To find the error estimates, we consider the forward difference table, which is given below:

x_i    y_i       Δy_i      Δ²y_i     Δ³y_i     Δ⁴y_i
0.0    1.00000   0.01005   0.02071   0.00189   0.00149
0.1    1.01005   0.03076   0.02260   0.00338   0.00171
0.2    1.04081   0.05336   0.02598   0.00519   0.00243
0.3    1.09417   0.07934   0.03117   0.00762   0.00328
0.4    1.17351   0.11051   0.03879   0.01090   0.00459
0.5    1.28402   0.14930   0.04969   0.01549   0.00658
0.6    1.43332   0.19899   0.06518   0.02207   0.00964
0.7    1.63231   0.26417   0.08725   0.03171
0.8    1.89648   0.35142   0.11896
0.9    2.24790   0.47038
1.0    2.71828
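Thus, error due to Trapezoidal rule is,

$$\begin{aligned}
E_1 &= -\frac{1-0}{12}\,\overline{\Delta^2 y} \\
&= -\frac{1}{12} \times \frac{0.02071 + 0.02260 + 0.02598 + 0.03117 + 0.03879 + 0.04969 + 0.06518 + 0.08725 + 0.11896}{9} \\
&= -0.004262315.
\end{aligned}$$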


Similarly, error due to Simpson's rule is,

$$\begin{aligned}
E_2 &= -\frac{1-0}{180}\,\overline{\Delta^4 y} \\
&= -\frac{1}{180} \times \frac{0.00149 + 0.00171 + 0.00243 + 0.00328 + 0.00459 + 0.00658 + 0.00964}{7} \\
&= -2.35873 \times 10^{-5}.
\end{aligned}$$


It shows that the error in numerical integration is much less by using Simpson's rule. 
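The computation in this example can be reproduced with a short Python sketch. The function names (`simpson`, `forward_differences`, `mean`) are illustrative choices, not part of the notes, and the error estimates use the formulas $E_1 = -\frac{b-a}{12}\overline{\Delta^2 y}$ and $E_2 = -\frac{b-a}{180}\overline{\Delta^4 y}$ exactly as applied above. Because the differences are recomputed from the tabulated $y_i$, the last digits of the estimates may differ slightly from the hand-tabulated values.

```python
def simpson(y, h):
    """Composite Simpson's rule for ordinates y[0..n] at equal spacing h (n must be even)."""
    n = len(y) - 1
    if n < 2 or n % 2 != 0:
        raise ValueError("Simpson's rule needs an even number of sub-intervals")
    odd_sum = sum(y[1:n:2])    # y1 + y3 + ... + y_{n-1}
    even_sum = sum(y[2:n:2])   # y2 + y4 + ... + y_{n-2}
    return h / 3 * (y[0] + y[n] + 4 * odd_sum + 2 * even_sum)


def forward_differences(y, order):
    """Return the list of forward differences of the given order."""
    d = list(y)
    for _ in range(order):
        d = [d[i + 1] - d[i] for i in range(len(d) - 1)]
    return d


def mean(values):
    return sum(values) / len(values)


# Tabulated values of y = e^(x^2) at x = 0.0, 0.1, ..., 1.0 (taken from the example).
y = [1.00000, 1.01005, 1.04081, 1.09417, 1.17351, 1.28402,
     1.43332, 1.63231, 1.89648, 2.24790, 2.71828]
a, b, h = 0.0, 1.0, 0.1

integral = simpson(y, h)
e1 = -(b - a) / 12 * mean(forward_differences(y, 2))    # Trapezoidal error estimate
e2 = -(b - a) / 180 * mean(forward_differences(y, 4))   # Simpson error estimate

print(f"Simpson value : {integral:.8f}")
print(f"E1 (trapezoid): {e1:.3e}")
print(f"E2 (Simpson)  : {e2:.3e}")
```

The slices `y[1:n:2]` and `y[2:n:2]` pick out the odd and even interior ordinates, mirroring the two sums in the quadrature formula.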


Example 13.3.5 Compute the integral $\int_{0.05}^{1} f(x)\,dx$, where the table for the values of $y = f(x)$ is given below:

x   0.05    0.1     0.15    0.2     0.3     0.4     0.5     0.6     0.7     0.8     0.9     1.0
y   0.0785  0.1564  0.2334  0.3090  0.4540  0.5878  0.7071  0.8090  0.8910  0.9511  0.9877  1.0000


Solution: Note that here the points are not given to be equidistant, so as such we cannot use any of
the above two formulae. However, we notice that the tabular points 0.05, 0.10, 0.15 and 0.20 are equidistant
and so are the tabular points 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 and 1.0. Now we can divide the interval into two
subintervals: [0.05, 0.2] and [0.2, 1.0]; thus,

$$\int_{0.05}^{1} f(x)\,dx = \int_{0.05}^{0.2} f(x)\,dx + \int_{0.2}^{1} f(x)\,dx.$$

The integrals can then be evaluated in each interval. We observe that the second set has an odd number of
points. Thus, the first integral is evaluated by using Trapezoidal rule and the second one by Simpson's rule
(of course, one could have used Trapezoidal rule in both the subintervals).
For the first integral $h = 0.05$ and for the second one $h = 0.1$. Thus,

$$\int_{0.05}^{0.2} f(x)\,dx = 0.05 \times \left[\frac{0.0785 + 0.3090}{2} + 0.1564 + 0.2334\right] = 0.0291775,$$

and

$$\int_{0.2}^{1.0} f(x)\,dx = \frac{0.1}{3} \times \left[(0.3090 + 1.0000) + 4 \times (0.4540 + 0.7071 + 0.8910 + 0.9877) + 2 \times (0.5878 + 0.8090 + 0.9511)\right] = 0.6054667,$$

which gives,

$$\int_{0.05}^{1} f(x)\,dx = 0.0291775 + 0.6054667 = 0.6346442.$$
It may be mentioned here that in the above integral, $f(x) = \sin(\pi x/2)$ and that the value of the integral
is 0.6346526. It will be interesting for the reader to compute the two integrals using Trapezoidal rule and
compare the values.
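The splitting used in this example can be sketched in Python as follows. The helper names `trapezoid` and `simpson` are illustrative, and the split point $x = 0.2$ is read off from the table, as in the solution above.

```python
def trapezoid(y, h):
    """Composite Trapezoidal rule for equally spaced ordinates."""
    if len(y) < 2:
        raise ValueError("need at least two ordinates")
    return h * ((y[0] + y[-1]) / 2 + sum(y[1:-1]))


def simpson(y, h):
    """Composite Simpson's rule; requires an odd number of ordinates."""
    n = len(y) - 1
    if n < 2 or n % 2 != 0:
        raise ValueError("Simpson's rule needs an even number of sub-intervals")
    return h / 3 * (y[0] + y[n] + 4 * sum(y[1:n:2]) + 2 * sum(y[2:n:2]))


x = [0.05, 0.1, 0.15, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
y = [0.0785, 0.1564, 0.2334, 0.3090, 0.4540, 0.5878,
     0.7071, 0.8090, 0.8910, 0.9511, 0.9877, 1.0000]

split = x.index(0.2)                      # the spacing changes from 0.05 to 0.1 here
first = trapezoid(y[:split + 1], 0.05)    # four points on [0.05, 0.2]
second = simpson(y[split:], 0.1)          # nine points on [0.2, 1.0]
total = first + second

print(f"{first:.7f} + {second:.7f} = {total:.7f}")
```

The ordinate at $x = 0.2$ is deliberately shared by both slices, since it is the right end point of the first piece and the left end point of the second.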


Exercise 13.3.6 1. Using Trapezoidal rule, compute the integral $\int_a^b f(x)\,dx$, where the table for the values
of $y = f(x)$ is given below. Also find an error estimate for the computed value.

(a) x   a=1      2        3        4        5        6        7        8        9        b=10
    y   0.09531  0.18232  0.26236  0.33647  0.40546  0.47000  0.53063  0.58779  0.64185  0.69314
(b) x   a=1.50   1.55     1.60     1.65     1.70     1.75     b=1.80
    y   0.40546  0.43825  0.47000  0.5077   0.53063  0.55962  0.58779
(c) x   a=1.0    1.5      2.0      2.5      3.0      b=3.5
    y   1.1752   2.1293   3.6269   6.0502   10.0179  16.5426

2. Using Simpson's rule, compute the integral $\int_a^b f(x)\,dx$. Also get an error estimate of the computed
integral.

(a) Use the table given in Exercise 13.3.6.1b.
(b) x   a=0.5    1.0      1.5      2.0      2.5      3.0      b=3.5
    y   0.493    0.946    1.325    1.605    1.778    1.849    1.833

3. Compute the integral $\int_0^{1.5} f(x)\,dx$, where the table for the values of $y = f(x)$ is given below:

x   0.0    0.5    0.7    0.9    1.1    1.2    1.3    1.4    1.5
y   0.00   0.39   0.77   1.27   1.90   2.26   2.65   3.07   3.53



Chapter 14 


Appendix 


14.1 System of Linear Equations 


Theorem 14.1.1 (Existence and Non-existence) Consider a linear system $Ax = b$, where $A$ is an $m \times n$
matrix, and $x$, $b$ are vectors with orders $n \times 1$ and $m \times 1$, respectively. Suppose $\text{rank}(A) = r$ and
$\text{rank}([A \;\; b]) = r_a$. Then exactly one of the following statements holds:

1. If $r_a = r < n$, the set of solutions of the linear system is an infinite set and has the form
$$\{u_0 + k_1 u_1 + k_2 u_2 + \cdots + k_{n-r} u_{n-r} : k_i \in \mathbb{R},\ 1 \le i \le n-r\},$$
where $u_0, u_1, \ldots, u_{n-r}$ are $n \times 1$ vectors satisfying $Au_0 = b$ and $Au_i = 0$ for $1 \le i \le n-r$.
2. If $r_a = r = n$, the solution set of the linear system has a unique $n \times 1$ vector $x_0$ satisfying $Ax_0 = b$.
3. If $r < r_a$, the linear system has no solution.


PROOF. Suppose $[C \;\; d]$ is the row reduced echelon form of the augmented matrix $[A \;\; b]$. Then
by Theorem 2.3.4, the solution set of the linear system $[C \;\; d]$ is the same as the solution set of the linear
system $[A \;\; b]$. So, the proof consists of understanding the solution set of the linear system $Cx = d$.


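1. Let $r = r_a < n$.

Then $[C \;\; d]$ has its first $r$ rows as the non-zero rows. So, by Remark 2.4.5, the matrix $C = [c_{ij}]$
has $r$ leading columns. Let the leading columns be $1 \le i_1 < i_2 < \cdots < i_r \le n$. Then we observe
the following:

(a) the entries $c_{l i_l}$ for $1 \le l \le r$ are leading terms. That is, for $1 \le l \le r$, all entries in the $i_l$th
column of $C$ are zero, except the entry $c_{l i_l}$. The entry $c_{l i_l} = 1$;
(b) corresponding to each leading column, we have $r$ BASIC VARIABLES, $x_{i_1}, x_{i_2}, \ldots, x_{i_r}$;
(c) the remaining $n - r$ columns correspond to the $n - r$ FREE VARIABLES (see Remark 2.4.5),
$x_{j_1}, x_{j_2}, \ldots, x_{j_{n-r}}$. So, the free variables correspond to the columns $1 \le j_1 < j_2 < \cdots < j_{n-r} \le n$.

For $1 \le l \le r$, consider the $l$th row of $[C \;\; d]$. The entry $c_{l i_l} = 1$ and is the leading term. Also, the
first $r$ rows of the augmented matrix $[C \;\; d]$ give rise to the linear equations

$$x_{i_l} + \sum_{k=1}^{n-r} c_{l j_k} x_{j_k} = d_l, \quad \text{for } 1 \le l \le r.$$

These equations can be rewritten as

$$x_{i_l} = d_l - \sum_{k=1}^{n-r} c_{l j_k} x_{j_k}, \quad \text{for } 1 \le l \le r.$$

Let $y^t = (x_{i_1}, \ldots, x_{i_r}, x_{j_1}, \ldots, x_{j_{n-r}})$. Then the set of solutions consists of

$$y = \begin{bmatrix} x_{i_1} \\ \vdots \\ x_{i_r} \\ x_{j_1} \\ \vdots \\ x_{j_{n-r}} \end{bmatrix}
= \begin{bmatrix} d_1 - \sum_{k=1}^{n-r} c_{1 j_k} x_{j_k} \\ \vdots \\ d_r - \sum_{k=1}^{n-r} c_{r j_k} x_{j_k} \\ x_{j_1} \\ \vdots \\ x_{j_{n-r}} \end{bmatrix}. \qquad (14.1.1)$$

As $x_{j_s}$ for $1 \le s \le n-r$ are free variables, let us assign arbitrary constants $k_s \in \mathbb{R}$ to $x_{j_s}$. That is,
for $1 \le s \le n-r$, $x_{j_s} = k_s$. Then the set of solutions is given by

$$y = \begin{bmatrix} d_1 - \sum_{s=1}^{n-r} c_{1 j_s} x_{j_s} \\ \vdots \\ d_r - \sum_{s=1}^{n-r} c_{r j_s} x_{j_s} \\ x_{j_1} \\ \vdots \\ x_{j_{n-r}} \end{bmatrix}
= \begin{bmatrix} d_1 - \sum_{s=1}^{n-r} c_{1 j_s} k_s \\ \vdots \\ d_r - \sum_{s=1}^{n-r} c_{r j_s} k_s \\ k_1 \\ \vdots \\ k_{n-r} \end{bmatrix}
= \begin{bmatrix} d_1 \\ \vdots \\ d_r \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}
- k_1 \begin{bmatrix} c_{1 j_1} \\ \vdots \\ c_{r j_1} \\ -1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}
- k_2 \begin{bmatrix} c_{1 j_2} \\ \vdots \\ c_{r j_2} \\ 0 \\ -1 \\ \vdots \\ 0 \end{bmatrix}
- \cdots
- k_{n-r} \begin{bmatrix} c_{1 j_{n-r}} \\ \vdots \\ c_{r j_{n-r}} \\ 0 \\ 0 \\ \vdots \\ -1 \end{bmatrix}.$$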
Let us write $v_0^t = (d_1, d_2, \ldots, d_r, 0, \ldots, 0)$. Also, for $1 \le i \le n-r$, let $v_i$ be the vector associated
with $k_i$ in the above representation of the solution $y$. Observe the following:

(a) if we assign $k_s = 0$ for $1 \le s \le n-r$, we get
$$Cv_0 = Cy = d. \qquad (14.1.2)$$
(b) if we assign $k_1 = 1$ and $k_s = 0$ for $2 \le s \le n-r$, we get
$$d = Cy = C(v_0 + v_1). \qquad (14.1.3)$$
So, using (14.1.2), we get $Cv_1 = 0$.
(c) in general, if we assign $k_t = 1$ and $k_s = 0$ for $1 \le s \ne t \le n-r$, we get
$$d = Cy = C(v_0 + v_t). \qquad (14.1.4)$$
So, using (14.1.2), we get $Cv_t = 0$.

Note that a rearrangement of the entries of $y$ will give us the solution vector $x^t = (x_1, x_2, \ldots, x_n)$.
Suppose that for $0 \le i \le n-r$, the vectors $u_i$ are obtained by applying to the entries of $v_i$ the same
rearrangement which, when applied to $y$, gave $x$. Therefore, we have $Cu_0 = d$ and, for
$1 \le i \le n-r$, $Cu_i = 0$. Now, using the equivalence of the linear systems $Ax = b$ and $Cx = d$ gives

$$Au_0 = b \quad \text{and for } 1 \le i \le n-r, \quad Au_i = 0.$$

Thus, we have obtained the desired result for the case $r = r_a < n$.


2. $r = r_a = n$, $m \ge n$.

Here the first $n$ rows of the row reduced echelon matrix $[C \;\; d]$ are the non-zero rows. Also, the
number of columns in $C$ equals $n = \text{rank}(A) = \text{rank}(C)$. So, by Remark 2.4.5, all the columns
of $C$ are leading columns and all the variables $x_1, x_2, \ldots, x_n$ are basic variables. Thus, the row
reduced echelon form $[C \;\; d]$ of $[A \;\; b]$ is given by

$$[C \;\; d] = \begin{bmatrix} I_n & \tilde{d} \\ 0 & 0 \end{bmatrix},$$

where $\tilde{d}$ consists of the first $n$ entries of $d$. Therefore, the solution set of the linear system $Cx = d$
is obtained using the equation $I_n x = \tilde{d}$.

This gives us a solution as $x_0 = \tilde{d}$. Also, as the row reduced echelon form of a given
matrix is unique (see Chapter 2), the solution obtained above is the only solution. That is, the solution set consists
of a single vector $\tilde{d}$.


3. $r < r_a$.

As $C$ has $n$ columns, the row reduced echelon matrix $[C \;\; d]$ has $n + 1$ columns. The condition
$r < r_a$ implies that $r_a = r + 1$. We now observe the following:

(a) as $\text{rank}(C) = r$, the $(r+1)$th row of $C$ consists of only zeros;
(b) whereas the condition $r_a = r + 1$ implies that the $(r+1)$th row of the matrix $[C \;\; d]$ is
non-zero.

Thus, the $(r+1)$th row of $[C \;\; d]$ is of the form $(0, \ldots, 0, 1)$. Or in other words, $d_{r+1} = 1$.

Thus, for the equivalent linear system $Cx = d$, the $(r+1)$th equation is

$$0x_1 + 0x_2 + \cdots + 0x_n = 1.$$

This linear equation has no solution. Hence, in this case, the linear system $Cx = d$ has no solution.
Therefore, by Theorem 2.3.4, the linear system $Ax = b$ has no solution.
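The three cases of the theorem depend only on $r = \text{rank}(A)$, $r_a = \text{rank}([A \;\; b])$ and $n$. The following Python sketch classifies a system accordingly. The function names `rank` and `classify` are illustrative, and exact rational arithmetic (`fractions.Fraction`) is an implementation choice made here to avoid round-off in the rank computation.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (given as a list of rows) by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0                                   # number of pivots found so far
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue                        # no leading term in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [vi - factor * vr for vi, vr in zip(m[i], m[r])]
        r += 1
    return r


def classify(A, b):
    """Return which case of the existence theorem applies to Ax = b."""
    n = len(A[0])
    r = rank(A)
    r_a = rank([row + [bi] for row, bi in zip(A, b)])   # rank of the augmented matrix
    if r < r_a:
        return "no solution"
    if r == n:
        return "unique solution"
    return "infinitely many solutions"


print(classify([[1, 1], [1, -1]], [2, 0]))   # r = r_a = n
print(classify([[1, 1], [2, 2]], [2, 4]))    # r = r_a < n
print(classify([[1, 1], [2, 2]], [2, 5]))    # r < r_a
```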


We now state a corollary whose proof is immediate from previous results. 


Corollary 14.1.2 Consider the linear system $Ax = b$. Then the two statements given below cannot hold
together.

1. The system $Ax = b$ has a unique solution for every $b$.

2. The system $Ax = 0$ has a non-trivial solution.


14.2 Determinant


In this section, $S$ denotes the set $\{1, 2, \ldots, n\}$.

Definition 14.2.1 1. A function $\sigma : S \to S$ is called a permutation on $n$ elements if $\sigma$ is both one to one
and onto.

2. The set of all functions $\sigma : S \to S$ that are both one to one and onto will be denoted by $S_n$. That is,
$S_n$ is the set of all permutations of the set $\{1, 2, \ldots, n\}$.


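Example 14.2.2 1. In general, we represent a permutation $\sigma$ by
$$\sigma = \begin{pmatrix} 1 & 2 & \cdots & n \\ \sigma(1) & \sigma(2) & \cdots & \sigma(n) \end{pmatrix}.$$
This representation of a permutation is called a TWO ROW NOTATION for $\sigma$.

2. For each positive integer $n$, $S_n$ has a special permutation called the identity permutation, denoted $Id_n$,
such that $Id_n(i) = i$ for $1 \le i \le n$. That is,
$$Id_n = \begin{pmatrix} 1 & 2 & \cdots & n \\ 1 & 2 & \cdots & n \end{pmatrix}.$$

3. Let $n = 3$. Then
$$S_3 = \left\{
\tau_1 = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix},\
\tau_2 = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix},\
\tau_3 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix},\
\tau_4 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix},\
\tau_5 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix},\
\tau_6 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}
\right\}. \qquad (14.2.5)$$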

Remark 14.2.3 1. Let $\sigma \in S_n$. Then $\sigma$ is determined if $\sigma(i)$ is known for $i = 1, 2, \ldots, n$. As $\sigma$ is
both one to one and onto, $\{\sigma(1), \sigma(2), \ldots, \sigma(n)\} = S$. So, there are $n$ choices for $\sigma(1)$ (any element
of $S$), $n - 1$ choices for $\sigma(2)$ (any element of $S$ different from $\sigma(1)$), and so on. Hence, there are
$n(n-1)(n-2) \cdots 3 \cdot 2 \cdot 1 = n!$ possible permutations. Thus, the number of elements in $S_n$ is $n!$.
That is, $|S_n| = n!$.

2. Suppose that $\sigma, \tau \in S_n$. Then both $\sigma$ and $\tau$ are one to one and onto. So, their composition map
$\sigma \circ \tau$, defined by $(\sigma \circ \tau)(i) = \sigma(\tau(i))$, is also both one to one and onto. Hence, $\sigma \circ \tau$ is also a
permutation. That is, $\sigma \circ \tau \in S_n$.

3. Suppose $\sigma \in S_n$. Then $\sigma$ is both one to one and onto. Hence, the function $\sigma^{-1} : S \to S$ defined
by $\sigma^{-1}(m) = \ell$ if and only if $\sigma(\ell) = m$ for $1 \le m \le n$, is well defined and indeed $\sigma^{-1}$ is also both
one to one and onto. Hence, for every element $\sigma \in S_n$, $\sigma^{-1} \in S_n$ and is the inverse of $\sigma$.

4. Observe that for any $\sigma \in S_n$, the compositions $\sigma \circ \sigma^{-1} = \sigma^{-1} \circ \sigma = Id_n$.


Proposition 14.2.4 Consider the set of all permutations $S_n$. Then the following hold:

1. Fix an element $\tau \in S_n$. Then the sets $\{\sigma \circ \tau : \sigma \in S_n\}$ and $\{\tau \circ \sigma : \sigma \in S_n\}$ have exactly $n!$ elements.
Or equivalently,
$$S_n = \{\tau \circ \sigma : \sigma \in S_n\} = \{\sigma \circ \tau : \sigma \in S_n\}.$$

2. $S_n = \{\sigma^{-1} : \sigma \in S_n\}$.

PROOF. For the first part, we need to show that given any element $\alpha \in S_n$, there exist elements
$\beta, \gamma \in S_n$ such that $\alpha = \tau \circ \beta = \gamma \circ \tau$. It can easily be verified that $\beta = \tau^{-1} \circ \alpha$ and $\gamma = \alpha \circ \tau^{-1}$.

For the second part, note that for any $\sigma \in S_n$, $(\sigma^{-1})^{-1} = \sigma$. Hence the result holds.

Definition 14.2.5 Let $\sigma \in S_n$. Then the number of inversions of $\sigma$, denoted $n(\sigma)$, equals
$$|\{(i, j) : i < j,\ \sigma(i) > \sigma(j)\}|.$$

Note that, for any $\sigma \in S_n$, $n(\sigma)$ also equals
$$\sum_{i=1}^{n} |\{\sigma(j) < \sigma(i), \text{ for } j = i+1, i+2, \ldots, n\}|.$$

Definition 14.2.6 A permutation $\sigma \in S_n$ is called a transposition if there exist two positive integers $m, r \in
\{1, 2, \ldots, n\}$ such that $\sigma(m) = r$, $\sigma(r) = m$ and $\sigma(i) = i$ for $1 \le i \ne m, r \le n$.

For the sake of convenience, a transposition $\sigma$ for which $\sigma(m) = r$, $\sigma(r) = m$ and $\sigma(i) = i$ for
$1 \le i \ne m, r \le n$ will be denoted simply by $\sigma = (m\ r)$ or $(r\ m)$. Also, note that for any transposition
$\sigma \in S_n$, $\sigma^{-1} = \sigma$. That is, $\sigma \circ \sigma = Id_n$.

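Example 14.2.7 1. The permutation $\tau = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 2 & 1 & 4 \end{pmatrix}$ is a transposition as $\tau(1) = 3$, $\tau(3) = 1$,
$\tau(2) = 2$ and $\tau(4) = 4$. Here note that $\tau = (1\ 3) = (3\ 1)$. Also, check that

$$n(\tau) = |\{(1, 2), (1, 3), (2, 3)\}| = 3.$$

2. Let $\tau = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 4 & 2 & 3 & 5 & 1 & 9 & 8 & 7 & 6 \end{pmatrix}$. Then check that

$$n(\tau) = 3 + 1 + 1 + 1 + 0 + 3 + 2 + 1 = 12.$$

3. Let $\ell$, $m$ and $r$ be distinct elements from $\{1, 2, \ldots, n\}$. Suppose $\tau = (m\ r)$ and $\sigma = (m\ \ell)$. Then

$$(\tau \circ \sigma)(\ell) = \tau(\sigma(\ell)) = \tau(m) = r, \qquad (\tau \circ \sigma)(m) = \tau(\sigma(m)) = \tau(\ell) = \ell,$$

$$(\tau \circ \sigma)(r) = \tau(\sigma(r)) = \tau(r) = m, \quad \text{and} \quad (\tau \circ \sigma)(i) = \tau(\sigma(i)) = \tau(i) = i \ \text{ if } i \ne \ell, m, r.$$

Therefore,
$$\tau \circ \sigma = (m\ r) \circ (m\ \ell) = \begin{pmatrix} 1 & 2 & \cdots & \ell & \cdots & m & \cdots & r & \cdots & n \\ 1 & 2 & \cdots & r & \cdots & \ell & \cdots & m & \cdots & n \end{pmatrix} = (r\ \ell) \circ (r\ m).$$

Similarly check that
$$\sigma \circ \tau = \begin{pmatrix} 1 & 2 & \cdots & \ell & \cdots & m & \cdots & r & \cdots & n \\ 1 & 2 & \cdots & m & \cdots & r & \cdots & \ell & \cdots & n \end{pmatrix}.$$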
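The inversion counts and the composition identity in this example can be checked mechanically. In the Python sketch below a permutation $\sigma$ of $\{1, \ldots, n\}$ is stored as the tuple $(\sigma(1), \ldots, \sigma(n))$; this representation and the helper names `inversions`, `compose` and `transposition` are illustrative choices. The values $n = 5$, $\ell = 2$, $m = 3$, $r = 5$ are an arbitrary instance of Part 3.

```python
def inversions(p):
    """n(sigma): the number of pairs i < j with sigma(i) > sigma(j)."""
    n = len(p)
    return sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])


def compose(s, t):
    """(s o t)(i) = s(t(i)); permutations are tuples of the images of 1..n."""
    return tuple(s[t[i] - 1] for i in range(len(t)))


def transposition(n, m, r):
    """The transposition (m r) in S_n."""
    p = list(range(1, n + 1))
    p[m - 1], p[r - 1] = r, m
    return tuple(p)


print(inversions((3, 2, 1, 4)))                   # Part 1 of the example
print(inversions((4, 2, 3, 5, 1, 9, 8, 7, 6)))    # Part 2 of the example

# Part 3 with n = 5, l = 2, m = 3, r = 5: (m r) o (m l) = (r l) o (r m).
n, l, m, r = 5, 2, 3, 5
lhs = compose(transposition(n, m, r), transposition(n, m, l))
rhs = compose(transposition(n, r, l), transposition(n, r, m))
print(lhs, rhs, lhs == rhs)
```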


With the above definitions, we state and prove two important results. 


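Theorem 14.2.8 For any $\sigma \in S_n$, $\sigma$ can be written as a composition (product) of transpositions.

PROOF. We will prove the result by induction on $n(\sigma)$, the number of inversions of $\sigma$. If $n(\sigma) = 0$, then
$\sigma = Id_n = (1\ 2) \circ (1\ 2)$. So, let the result be true for all $\sigma \in S_n$ with $n(\sigma) \le k$.

For the next step of the induction, suppose that $\tau \in S_n$ with $n(\tau) = k + 1$. Choose the smallest
positive number, say $\ell$, such that

$$\tau(i) = i, \ \text{ for } i = 1, 2, \ldots, \ell - 1 \quad \text{and} \quad \tau(\ell) \ne \ell.$$

As $\tau$ is a permutation, there exists a positive number, say $m$, such that $\tau(\ell) = m$. Also, note that $m > \ell$.
Define a transposition $\sigma$ by $\sigma = (\ell\ m)$. Then note that

$$(\sigma \circ \tau)(i) = i, \ \text{ for } i = 1, 2, \ldots, \ell.$$

So, the definition of “number of inversions” and $m > \ell$ imply that

$$\begin{aligned}
n(\sigma \circ \tau) &= \sum_{i=1}^{n} |\{(\sigma \circ \tau)(j) < (\sigma \circ \tau)(i), \text{ for } j = i+1, i+2, \ldots, n\}| \\
&= \sum_{i=1}^{\ell} |\{(\sigma \circ \tau)(j) < (\sigma \circ \tau)(i), \text{ for } j = i+1, i+2, \ldots, n\}| \\
&\qquad + \sum_{i=\ell+1}^{n} |\{(\sigma \circ \tau)(j) < (\sigma \circ \tau)(i), \text{ for } j = i+1, i+2, \ldots, n\}| \\
&= \sum_{i=\ell+1}^{n} |\{(\sigma \circ \tau)(j) < (\sigma \circ \tau)(i), \text{ for } j = i+1, i+2, \ldots, n\}| \\
&\le \sum_{i=\ell+1}^{n} |\{\tau(j) < \tau(i), \text{ for } j = i+1, i+2, \ldots, n\}| \quad \text{as } m > \ell, \\
&< (m - \ell) + \sum_{i=\ell+1}^{n} |\{\tau(j) < \tau(i), \text{ for } j = i+1, i+2, \ldots, n\}| \\
&= n(\tau).
\end{aligned}$$

Thus, $n(\sigma \circ \tau) < k + 1$. Hence, by the induction hypothesis, the permutation $\sigma \circ \tau$ is a composition of
transpositions. That is, there exist transpositions, say $\alpha_i$, $1 \le i \le t$, such that

$$\sigma \circ \tau = \alpha_1 \circ \alpha_2 \circ \cdots \circ \alpha_t.$$

Hence, $\tau = \sigma \circ \alpha_1 \circ \alpha_2 \circ \cdots \circ \alpha_t$ as $\sigma \circ \sigma = Id_n$ for any transposition $\sigma \in S_n$. Therefore, by mathematical
induction, the proof of the theorem is complete.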
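The induction step is constructive: repeatedly pick the smallest $\ell$ with $\tau(\ell) \ne \ell$, put $m = \tau(\ell)$, and replace $\tau$ by $(\ell\ m) \circ \tau$. A minimal Python sketch of this procedure follows. Permutations are stored as tuples of images, and the names `decompose`, `compose` and `transposition` are illustrative. For the identity the sketch returns an empty list of factors rather than $(1\ 2) \circ (1\ 2)$.

```python
def compose(s, t):
    """(s o t)(i) = s(t(i)); permutations are tuples of the images of 1..n."""
    return tuple(s[t[i] - 1] for i in range(len(t)))


def transposition(n, a, b):
    """The transposition (a b) in S_n."""
    p = list(range(1, n + 1))
    p[a - 1], p[b - 1] = b, a
    return tuple(p)


def decompose(tau):
    """Return pairs (l, m) such that tau = (l1 m1) o (l2 m2) o ... o (lt mt)."""
    n = len(tau)
    identity = tuple(range(1, n + 1))
    current = tuple(tau)
    factors = []
    while current != identity:
        l = next(i for i in range(1, n + 1) if current[i - 1] != i)  # smallest moved point
        m = current[l - 1]                                           # m = tau(l) > l
        factors.append((l, m))
        current = compose(transposition(n, l, m), current)           # now fixes 1, ..., l
    return factors


tau = (4, 2, 3, 5, 1, 9, 8, 7, 6)
factors = decompose(tau)
print(factors)

# Recompose the factors to confirm that they multiply back to tau.
product = tuple(range(1, len(tau) + 1))
for a, b in reversed(factors):
    product = compose(transposition(len(tau), a, b), product)
print(product == tau)
```

Each pass through the loop fixes at least one more point, so the loop runs at most $n - 1$ times.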


Before coming to our next important result, we state and prove the following lemma. 


Lemma 14.2.9 Suppose there exist transpositions $\alpha_i$, $1 \le i \le t$, such that
$$Id_n = \alpha_1 \circ \alpha_2 \circ \cdots \circ \alpha_t,$$
then $t$ is even.

PROOF. Observe that $t \ne 1$ as the identity permutation is not a transposition. Hence, $t \ge 2$. If $t = 2$,
we are done. So, let us assume that $t \ge 3$. We will prove the result by the method of mathematical
induction. The result clearly holds for $t = 2$. Let the result be true for all expressions in which the
number of transpositions $t \le k$. Now, let $t = k + 1$.

Suppose $\alpha_1 = (m\ r)$. Note that the possible choices for the composition $\alpha_1 \circ \alpha_2$ are

$$(m\ r) \circ (m\ r) = Id_n, \quad (m\ r) \circ (m\ \ell) = (r\ \ell) \circ (r\ m), \quad (m\ r) \circ (r\ \ell) = (\ell\ r) \circ (\ell\ m) \quad \text{and} \quad (m\ r) \circ (\ell\ s) = (\ell\ s) \circ (m\ r),$$

where $\ell$ and $s$ are distinct elements of $\{1, 2, \ldots, n\}$ and are different from $m$, $r$. In the first case, we
can remove $\alpha_1 \circ \alpha_2$ and obtain $Id_n = \alpha_3 \circ \alpha_4 \circ \cdots \circ \alpha_t$. In this expression for identity, the number of
transpositions is $t - 2 = k - 1 < k$. So, by mathematical induction, $t - 2$ is even and hence $t$ is also even.

In the other three cases, we replace the original expression for $\alpha_1 \circ \alpha_2$ by their counterparts on the
right to obtain another expression for identity in terms of $t = k + 1$ transpositions. But note that in the
new expression for identity, the positive integer $m$ doesn't appear in the first transposition, but appears
in the second transposition. We can continue the above process with the second and third transpositions.
At this step, either the number of transpositions will reduce by 2 (giving us the result by mathematical
induction) or the positive number $m$ will get shifted to the third transposition. The continuation of this
process will at some stage lead to an expression for identity in which the number of transpositions is
$t - 2 = k - 1$ (which will give us the desired result by mathematical induction), or else we will have
an expression in which the positive number $m$ will get shifted to the right most transposition. In the
latter case, the positive integer $m$ appears exactly once in the expression for identity and hence this
expression does not fix $m$, whereas for the identity permutation $Id_n(m) = m$. So the latter case leads us
to a contradiction.

Hence, the process will surely lead to an expression in which the number of transpositions at some
stage is $t - 2 = k - 1$. Therefore, by mathematical induction, the proof of the lemma is complete.


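Theorem 14.2.10 Let $\alpha \in S_n$. Suppose there exist transpositions $\tau_1, \tau_2, \ldots, \tau_k$ and $\sigma_1, \sigma_2, \ldots, \sigma_\ell$ such that
$$\alpha = \tau_1 \circ \tau_2 \circ \cdots \circ \tau_k = \sigma_1 \circ \sigma_2 \circ \cdots \circ \sigma_\ell,$$
then either $k$ and $\ell$ are both even or both odd.

PROOF. Observe that the condition $\tau_1 \circ \tau_2 \circ \cdots \circ \tau_k = \sigma_1 \circ \sigma_2 \circ \cdots \circ \sigma_\ell$ and $\sigma \circ \sigma = Id_n$ for any
transposition $\sigma \in S_n$ imply that

$$Id_n = \tau_1 \circ \tau_2 \circ \cdots \circ \tau_k \circ \sigma_\ell \circ \sigma_{\ell-1} \circ \cdots \circ \sigma_1.$$

Hence by Lemma 14.2.9, $k + \ell$ is even. Hence, either $k$ and $\ell$ are both even or both odd. Thus the result
follows.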


Definition 14.2.11 A permutation $\sigma \in S_n$ is called an even permutation if $\sigma$ can be written as a composition
(product) of an even number of transpositions. A permutation $\sigma \in S_n$ is called an odd permutation if $\sigma$ can
be written as a composition (product) of an odd number of transpositions.

Remark 14.2.12 Observe that if $\sigma$ and $\tau$ are both even or both odd permutations, then the permutations
$\sigma \circ \tau$ and $\tau \circ \sigma$ are both even. Whereas if one of them is odd and the other even then the
permutations $\sigma \circ \tau$ and $\tau \circ \sigma$ are both odd. We use this to define a function on $S_n$, called the sign of a
permutation, as follows:

Definition 14.2.13 Let $\text{sgn} : S_n \to \{1, -1\}$ be a function defined by

$$\text{sgn}(\sigma) = \begin{cases} 1 & \text{if } \sigma \text{ is an even permutation,} \\ -1 & \text{if } \sigma \text{ is an odd permutation.} \end{cases}$$

Example 14.2.14 1. The identity permutation, $Id_n$, is an even permutation whereas every transposition
is an odd permutation. Thus, $\text{sgn}(Id_n) = 1$ and for any transposition $\sigma \in S_n$, $\text{sgn}(\sigma) = -1$.

2. Using Remark 14.2.12, $\text{sgn}(\sigma \circ \tau) = \text{sgn}(\sigma) \cdot \text{sgn}(\tau)$ for any two permutations $\sigma, \tau \in S_n$.


We are now ready to define determinant of a square matrix A. 


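Definition 14.2.15 Let $A = [a_{ij}]$ be an $n \times n$ matrix with entries from $\mathbb{F}$. The determinant of $A$, denoted
$\det(A)$, is defined as

$$\det(A) = \sum_{\sigma \in S_n} \text{sgn}(\sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n\sigma(n)} = \sum_{\sigma \in S_n} \text{sgn}(\sigma) \prod_{i=1}^{n} a_{i\sigma(i)}.$$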


Remark 14.2.16 1. Observe that $\det(A)$ is a scalar quantity. The expression for $\det(A)$ seems
complicated at first glance. But this expression is very helpful in proving the results related
with “properties of determinant”.

2. If $A = [a_{ij}]$ is a $3 \times 3$ matrix, then using (14.2.5),

$$\begin{aligned}
\det(A) &= \sum_{\sigma \in S_3} \text{sgn}(\sigma) \prod_{i=1}^{3} a_{i\sigma(i)} \\
&= \text{sgn}(\tau_1) \prod_{i=1}^{3} a_{i\tau_1(i)} + \text{sgn}(\tau_2) \prod_{i=1}^{3} a_{i\tau_2(i)} + \text{sgn}(\tau_3) \prod_{i=1}^{3} a_{i\tau_3(i)} \\
&\qquad + \text{sgn}(\tau_4) \prod_{i=1}^{3} a_{i\tau_4(i)} + \text{sgn}(\tau_5) \prod_{i=1}^{3} a_{i\tau_5(i)} + \text{sgn}(\tau_6) \prod_{i=1}^{3} a_{i\tau_6(i)} \\
&= a_{11}a_{22}a_{33} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31}.
\end{aligned}$$

Observe that this expression for $\det(A)$ for a $3 \times 3$ matrix $A$ is the same as that given in (2.8.1).
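The definition can be evaluated directly for small matrices. The Python sketch below sums over all permutations produced by `itertools.permutations`. The names `sign` and `det_leibniz` are illustrative. The sign is computed as $(-1)^{n(\sigma)}$, a standard fact about permutations that this section uses without proof being given here, so it should be read as an assumption of the sketch. The $3 \times 3$ matrix is an arbitrary test case.

```python
from itertools import permutations


def sign(p):
    """sgn of a permutation (tuple of images), via the parity of its inversion count."""
    n = len(p)
    inv = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inv % 2 else 1


def det_leibniz(A):
    """Sum over sigma in S_n of sgn(sigma) * a_{1 sigma(1)} * ... * a_{n sigma(n)}."""
    n = len(A)
    total = 0
    for p in permutations(range(n)):      # 0-based images of 0..n-1
        term = sign(p)
        for i in range(n):
            term *= A[i][p[i]]
        total += term
    return total


A = [[2, 1, 0],
     [1, 3, 1],
     [0, 1, 4]]

# The six-term expansion for a 3 x 3 matrix, written out as in the remark.
six_terms = (A[0][0] * A[1][1] * A[2][2] - A[0][0] * A[1][2] * A[2][1]
             - A[0][1] * A[1][0] * A[2][2] + A[0][1] * A[1][2] * A[2][0]
             + A[0][2] * A[1][0] * A[2][1] - A[0][2] * A[1][1] * A[2][0])

print(det_leibniz(A), six_terms)
```

The sum has $n!$ terms, so this is a check of the definition rather than a practical way to compute determinants of large matrices.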


14.3 Properties of Determinant


Theorem 14.3.1 (Properties of Determinant) Let $A = [a_{ij}]$ be an $n \times n$ matrix. Then

1. if $B$ is obtained from $A$ by interchanging two rows, then
$\det(B) = -\det(A)$.

2. if $B$ is obtained from $A$ by multiplying a row by $c$ then
$\det(B) = c \det(A)$.

3. if all the elements of one row are 0 then $\det(A) = 0$.

4. if $A$ is a square matrix having two rows equal then $\det(A) = 0$.

5. Let $B = [b_{ij}]$ and $C = [c_{ij}]$ be two matrices which differ from the matrix $A = [a_{ij}]$ only in the $m$th
row for some $m$. If $c_{mj} = a_{mj} + b_{mj}$ for $1 \le j \le n$ then $\det(C) = \det(A) + \det(B)$.

6. if $B$ is obtained from $A$ by replacing the $\ell$th row by itself plus $k$ times the $m$th row, for $\ell \ne m$, then
$\det(B) = \det(A)$.

7. if $A$ is a triangular matrix then $\det(A) = a_{11}a_{22} \cdots a_{nn}$, the product of the diagonal elements.

8. If $E$ is an elementary matrix of order $n$ then $\det(EA) = \det(E) \det(A)$.

9. $A$ is invertible if and only if $\det(A) \ne 0$.

10. If $B$ is an $n \times n$ matrix then $\det(AB) = \det(A) \det(B)$.

11. $\det(A) = \det(A^t)$, where recall that $A^t$ is the transpose of the matrix $A$.


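PROOF. Proof of Part 1. Suppose $B = [b_{ij}]$ is obtained from $A = [a_{ij}]$ by the interchange of the $\ell$th
and $m$th row. Then $b_{\ell j} = a_{mj}$, $b_{mj} = a_{\ell j}$ for $1 \le j \le n$ and $b_{ij} = a_{ij}$ for $1 \le i \ne \ell, m \le n$, $1 \le j \le n$.
Let $\tau = (\ell\ m)$ be a transposition. Then by Proposition 14.2.4, $S_n = \{\sigma \circ \tau : \sigma \in S_n\}$. Hence by the
definition of determinant and Example 14.2.14.2, we have

$$\begin{aligned}
\det(B) &= \sum_{\sigma \in S_n} \text{sgn}(\sigma) \prod_{i=1}^{n} b_{i\sigma(i)} = \sum_{\sigma \circ \tau \in S_n} \text{sgn}(\sigma \circ \tau) \prod_{i=1}^{n} b_{i(\sigma \circ \tau)(i)} \\
&= \sum_{\sigma \circ \tau \in S_n} \text{sgn}(\tau) \cdot \text{sgn}(\sigma)\, b_{1(\sigma \circ \tau)(1)} b_{2(\sigma \circ \tau)(2)} \cdots b_{\ell(\sigma \circ \tau)(\ell)} \cdots b_{m(\sigma \circ \tau)(m)} \cdots b_{n(\sigma \circ \tau)(n)} \\
&= \text{sgn}(\tau) \sum_{\sigma \in S_n} \text{sgn}(\sigma)\, b_{1\sigma(1)} \cdot b_{2\sigma(2)} \cdots b_{\ell\sigma(m)} \cdots b_{m\sigma(\ell)} \cdots b_{n\sigma(n)} \\
&= -\Big( \sum_{\sigma \in S_n} \text{sgn}(\sigma)\, a_{1\sigma(1)} \cdot a_{2\sigma(2)} \cdots a_{m\sigma(m)} \cdots a_{\ell\sigma(\ell)} \cdots a_{n\sigma(n)} \Big) \quad \text{as } \text{sgn}(\tau) = -1 \\
&= -\det(A).
\end{aligned}$$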


Proof of Part 2. Suppose that $B = [b_{ij}]$ is obtained by multiplying the $m$th row of $A$ by $c \ne 0$. Then
$b_{mj} = c\,a_{mj}$ and $b_{ij} = a_{ij}$ for $1 \le i \ne m \le n$, $1 \le j \le n$. Then

$$\begin{aligned}
\det(B) &= \sum_{\sigma \in S_n} \text{sgn}(\sigma)\, b_{1\sigma(1)} b_{2\sigma(2)} \cdots b_{m\sigma(m)} \cdots b_{n\sigma(n)} \\
&= \sum_{\sigma \in S_n} \text{sgn}(\sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots c\,a_{m\sigma(m)} \cdots a_{n\sigma(n)} \\
&= c \sum_{\sigma \in S_n} \text{sgn}(\sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{m\sigma(m)} \cdots a_{n\sigma(n)} \\
&= c \det(A).
\end{aligned}$$

Proof of Part 3. Note that $\det(A) = \sum_{\sigma \in S_n} \text{sgn}(\sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n\sigma(n)}$. So, each term in the expression
for the determinant contains one entry from each row. Hence, from the condition that $A$ has a row consisting
of all zeros, the value of each term is 0. Thus, $\det(A) = 0$.

Proof of Part 4. Suppose that the $\ell$th and $m$th rows of $A$ are equal. Let $B$ be the matrix obtained
from $A$ by interchanging the $\ell$th and $m$th rows. Then by the first part, $\det(B) = -\det(A)$. But the
assumption implies that $B = A$. Hence, $\det(B) = \det(A)$. So, we have $\det(B) = -\det(A) = \det(A)$.
Hence, $\det(A) = 0$.


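Proof of Part 5. By definition and the given assumption, we have

$$\begin{aligned}
\det(C) &= \sum_{\sigma \in S_n} \text{sgn}(\sigma)\, c_{1\sigma(1)} c_{2\sigma(2)} \cdots c_{m\sigma(m)} \cdots c_{n\sigma(n)} \\
&= \sum_{\sigma \in S_n} \text{sgn}(\sigma)\, c_{1\sigma(1)} c_{2\sigma(2)} \cdots (b_{m\sigma(m)} + a_{m\sigma(m)}) \cdots c_{n\sigma(n)} \\
&= \sum_{\sigma \in S_n} \text{sgn}(\sigma)\, b_{1\sigma(1)} b_{2\sigma(2)} \cdots b_{m\sigma(m)} \cdots b_{n\sigma(n)} \\
&\qquad + \sum_{\sigma \in S_n} \text{sgn}(\sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{m\sigma(m)} \cdots a_{n\sigma(n)} \\
&= \det(B) + \det(A).
\end{aligned}$$

Proof of Part 6. Suppose that $B = [b_{ij}]$ is obtained from $A$ by replacing the $\ell$th row by itself plus $k$
times the $m$th row, for $\ell \ne m$. Then $b_{\ell j} = a_{\ell j} + k\,a_{mj}$ and $b_{ij} = a_{ij}$ for $1 \le i \ne \ell \le n$, $1 \le j \le n$.
Then

$$\begin{aligned}
\det(B) &= \sum_{\sigma \in S_n} \text{sgn}(\sigma)\, b_{1\sigma(1)} b_{2\sigma(2)} \cdots b_{\ell\sigma(\ell)} \cdots b_{m\sigma(m)} \cdots b_{n\sigma(n)} \\
&= \sum_{\sigma \in S_n} \text{sgn}(\sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots (a_{\ell\sigma(\ell)} + k\,a_{m\sigma(\ell)}) \cdots a_{m\sigma(m)} \cdots a_{n\sigma(n)} \\
&= \sum_{\sigma \in S_n} \text{sgn}(\sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{\ell\sigma(\ell)} \cdots a_{m\sigma(m)} \cdots a_{n\sigma(n)} \\
&\qquad + k \sum_{\sigma \in S_n} \text{sgn}(\sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{m\sigma(\ell)} \cdots a_{m\sigma(m)} \cdots a_{n\sigma(n)} \\
&= \sum_{\sigma \in S_n} \text{sgn}(\sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{\ell\sigma(\ell)} \cdots a_{m\sigma(m)} \cdots a_{n\sigma(n)} \qquad \text{(use Part 4)} \\
&= \det(A).
\end{aligned}$$

Here the second sum is the determinant of the matrix obtained from $A$ by replacing its $\ell$th row with its $m$th row, which has two equal rows and therefore vanishes.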


Proof of Part 7. First let us assume that $A$ is an upper triangular matrix. Observe that if $\sigma \in S_n$
is different from the identity permutation then $n(\sigma) \ge 1$. So, for every $\sigma \ne Id_n \in S_n$, there exists a
positive integer $m$, $2 \le m \le n$ (depending on $\sigma$) such that $m > \sigma(m)$. As $A$ is an upper triangular
matrix, $a_{m\sigma(m)} = 0$ for each $\sigma\,(\ne Id_n) \in S_n$. Hence the result follows.

A similar reasoning holds true in case $A$ is a lower triangular matrix.


Proof of Part 8. Let $I_n$ be the identity matrix of order $n$. Then using Part 7, $\det(I_n) = 1$. Also,
recalling the notations for the elementary matrices given earlier, we have $\det(E_{ij}) = -1$
(using Part 1), $\det(E_i(c)) = c$ (using Part 2) and $\det(E_{ij}(k)) = 1$ (using Part 6). Again using Parts 1, 2
and 6, we get $\det(EA) = \det(E) \det(A)$.

Proof of Part 9. Suppose $A$ is invertible. Then by Theorem 2.7.7, $A$ is a product of elementary
matrices. That is, there exist elementary matrices $E_1, E_2, \ldots, E_k$ such that $A = E_1 E_2 \cdots E_k$. Now a
repeated application of Part 8 implies that $\det(A) = \det(E_1) \det(E_2) \cdots \det(E_k)$. But $\det(E_i) \ne 0$ for
$1 \le i \le k$. Hence, $\det(A) \ne 0$.

Now assume that $\det(A) \ne 0$. We show that $A$ is invertible. On the contrary, assume that $A$ is
not invertible. Then by Theorem 2.7.7, the matrix $A$ is not of full rank. That is, there exists a positive
integer $r < n$ such that $\text{rank}(A) = r$. So, there exist elementary matrices $E_1, E_2, \ldots, E_k$ such that
$E_1 E_2 \cdots E_k A = \begin{bmatrix} B \\ 0 \end{bmatrix}$. Therefore, by Part 3 and a repeated application of Part 8,

$$\det(E_1) \det(E_2) \cdots \det(E_k) \det(A) = \det(E_1 E_2 \cdots E_k A) = \det\left(\begin{bmatrix} B \\ 0 \end{bmatrix}\right) = 0.$$

But $\det(E_i) \ne 0$ for $1 \le i \le k$. Hence, $\det(A) = 0$. This contradicts our assumption that $\det(A) \ne 0$.
Hence our assumption is false and therefore $A$ is invertible.
Proof of Part 10. Suppose $A$ is not invertible. Then by Part 9, $\det(A) = 0$. Also, the product matrix
$AB$ is also not invertible. So, again by Part 9, $\det(AB) = 0$. Thus, $\det(AB) = \det(A) \det(B)$.

Now suppose that $A$ is invertible. Then by Theorem 2.7.7, $A$ is a product of elementary matrices.
That is, there exist elementary matrices $E_1, E_2, \ldots, E_k$ such that $A = E_1 E_2 \cdots E_k$. Now a repeated
application of Part 8 implies that

$$\begin{aligned}
\det(AB) &= \det(E_1 E_2 \cdots E_k B) = \det(E_1) \det(E_2) \cdots \det(E_k) \det(B) \\
&= \det(E_1 E_2 \cdots E_k) \det(B) = \det(A) \det(B).
\end{aligned}$$

Proof of Part 11. Let $B = [b_{ij}] = A^t$. Then $b_{ij} = a_{ji}$ for $1 \le i, j \le n$. By Proposition 14.2.4, we know
that $S_n = \{\sigma^{-1} : \sigma \in S_n\}$. Also $\text{sgn}(\sigma) = \text{sgn}(\sigma^{-1})$. Hence,

$$\begin{aligned}
\det(B) &= \sum_{\sigma \in S_n} \text{sgn}(\sigma)\, b_{1\sigma(1)} b_{2\sigma(2)} \cdots b_{n\sigma(n)} \\
&= \sum_{\sigma \in S_n} \text{sgn}(\sigma^{-1})\, b_{\sigma^{-1}(1)\,1}\, b_{\sigma^{-1}(2)\,2} \cdots b_{\sigma^{-1}(n)\,n} \\
&= \sum_{\sigma \in S_n} \text{sgn}(\sigma^{-1})\, a_{1\sigma^{-1}(1)}\, a_{2\sigma^{-1}(2)} \cdots a_{n\sigma^{-1}(n)} \\
&= \det(A).
\end{aligned}$$
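Several parts of the theorem can be spot-checked numerically. The Python sketch below evaluates the determinant from the permutation expansion of Definition 14.2.15, taking the sign as $(-1)^{n(\sigma)}$ (a standard fact, assumed here), and tests Parts 1, 2, 4, 6, 10 and 11 on small integer matrices. The matrices and helper names are arbitrary illustrative choices, and passing checks on examples is of course not a proof.

```python
from itertools import permutations


def det(A):
    """Determinant from the permutation expansion (practical only for small n)."""
    n = len(A)
    total = 0
    for p in permutations(range(n)):
        inv = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
        term = -1 if inv % 2 else 1
        for i in range(n):
            term *= A[i][p[i]]
        total += term
    return total


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def transpose(A):
    return [list(col) for col in zip(*A)]


A = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
B = [[1, 0, 2], [0, 1, 1], [3, 0, 1]]

swapped = [A[1], A[0], A[2]]                                     # Part 1: interchange rows 1 and 2
scaled = [A[0], [5 * v for v in A[1]], A[2]]                     # Part 2: multiply row 2 by 5
equal_rows = [A[0], A[0], A[2]]                                  # Part 4: two equal rows
added = [[a + 2 * c for a, c in zip(A[0], A[2])], A[1], A[2]]    # Part 6: row 1 + 2 * row 3

print(det(A), det(swapped), det(scaled), det(equal_rows), det(added))
print(det(matmul(A, B)), det(A) * det(B))                        # Part 10
print(det(transpose(A)))                                         # Part 11
```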


Remark 14.3.2 1. The result that $\det(A) = \det(A^t)$ implies that in the statements made in Theorem 14.3.1,
wherever the word “row” appears it can be replaced by “column”.

2. Let $A = [a_{ij}]$ be a matrix satisfying $a_{11} = 1$ and $a_{1j} = 0$ for $2 \le j \le n$. Let $B$ be the submatrix
of $A$ obtained by removing the first row and the first column. Then it can be easily shown that
$\det(A) = \det(B)$. The reason is as follows:

for every $\sigma \in S_n$, the condition $\sigma(1) = 1$ is equivalent to saying that $\sigma$ is a permutation of the elements
$\{2, 3, \ldots, n\}$. That is, $\sigma \in S_{n-1}$. Hence,

$$\begin{aligned}
\det(A) &= \sum_{\sigma \in S_n} \text{sgn}(\sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n\sigma(n)} = \sum_{\sigma \in S_n,\ \sigma(1) = 1} \text{sgn}(\sigma)\, a_{2\sigma(2)} \cdots a_{n\sigma(n)} \\
&= \sum_{\sigma \in S_{n-1}} \text{sgn}(\sigma)\, b_{1\sigma(1)} \cdots b_{(n-1)\sigma(n-1)} = \det(B).
\end{aligned}$$


We are now ready to relate this definition of determinant with the one given in Definition [2.8.2] 


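Theorem 14.3.3 Let $A$ be an $n \times n$ matrix. Then $\det(A) = \sum_{j=1}^{n} (-1)^{1+j} a_{1j} \det(A(1|j))$, where recall that
$A(1|j)$ is the submatrix of $A$ obtained by removing the 1st row and the $j$th column.

PROOF. For $1 \le j \le n$, define two matrices

$$B_j = \begin{bmatrix} 0 & 0 & \cdots & a_{1j} & \cdots & 0 \\ a_{21} & a_{22} & \cdots & a_{2j} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nj} & \cdots & a_{nn} \end{bmatrix}_{n \times n}
\quad \text{and} \quad
C_j = \begin{bmatrix} a_{1j} & 0 & \cdots & 0 & 0 & \cdots & 0 \\ a_{2j} & a_{21} & \cdots & a_{2,j-1} & a_{2,j+1} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots & \vdots & & \vdots \\ a_{nj} & a_{n1} & \cdots & a_{n,j-1} & a_{n,j+1} & \cdots & a_{nn} \end{bmatrix}_{n \times n}.$$

Then by Theorem 14.3.1 (Part 5),

$$\det(A) = \sum_{j=1}^{n} \det(B_j). \qquad (14.3.6)$$

We now compute $\det(B_j)$ for $1 \le j \le n$. Note that the matrix $B_j$ can be transformed into $C_j$ by $j - 1$
interchanges of columns done in the following manner:

first interchange the $(j-1)$th and $j$th columns, then interchange the $(j-2)$th and $(j-1)$th columns and so on (the last
step consists of interchanging the 1st column with the 2nd column). Then by Remark 14.3.2
and Parts 1 and 2 of Theorem 14.3.1, we have $\det(B_j) = (-1)^{j-1} \det(C_j) = (-1)^{j-1} a_{1j} \det(A(1|j))$. Therefore by (14.3.6),

$$\det(A) = \sum_{j=1}^{n} (-1)^{j-1} a_{1j} \det(A(1|j)) = \sum_{j=1}^{n} (-1)^{j+1} a_{1j} \det(A(1|j)).$$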
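The agreement between the two definitions can be checked on examples. The following Python sketch compares the first-row expansion of Theorem 14.3.3 with the permutation expansion of Definition 14.2.15. The names `det_leibniz`, `minor` and `det_cofactor` and the test matrices are illustrative assumptions, and the sign of a permutation is taken as $(-1)^{n(\sigma)}$.

```python
from itertools import permutations


def det_leibniz(A):
    """Determinant as a signed sum over all permutations."""
    n = len(A)
    total = 0
    for p in permutations(range(n)):
        inv = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
        term = -1 if inv % 2 else 1
        for i in range(n):
            term *= A[i][p[i]]
        total += term
    return total


def minor(A, j):
    """A(1|j): delete the first row and column j (0-based j)."""
    return [row[:j] + row[j + 1:] for row in A[1:]]


def det_cofactor(A):
    """Expansion along the first row; (-1)**j with 0-based j equals (-1)**(1+j) with 1-based j."""
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det_cofactor(minor(A, j)) for j in range(n))


A3 = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
A4 = [[1, 2, 0, 3], [0, 1, 4, 1], [2, 0, 1, 0], [1, 1, 0, 2]]

print(det_cofactor(A3), det_leibniz(A3))
print(det_cofactor(A4) == det_leibniz(A4))
```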


j=l j=l 
14.4 Dimension of M + N


Theorem 14.4.1 Let $V(\mathbb{F})$ be a finite dimensional vector space and let $M$ and $N$ be two subspaces of $V$.
Then

$$\dim(M) + \dim(N) = \dim(M + N) + \dim(M \cap N). \qquad (14.4.7)$$

PROOF. Since $M \cap N$ is a vector subspace of $V$, consider a basis $B_1 = \{u_1, u_2, \ldots, u_k\}$ of $M \cap N$.
As $M \cap N$ is a subspace of the vector spaces $M$ and $N$, we extend the basis $B_1$ to form a basis
$B_M = \{u_1, u_2, \ldots, u_k, v_1, \ldots, v_r\}$ of $M$ and also a basis $B_N = \{u_1, u_2, \ldots, u_k, w_1, \ldots, w_s\}$ of $N$.
We now proceed to prove that the set $B_2 = \{u_1, u_2, \ldots, u_k, w_1, \ldots, w_s, v_1, v_2, \ldots, v_r\}$ is a basis
of $M + N$.

To do this, we show that
1. the set $B_2$ is a linearly independent subset of $V$, and
2. $L(B_2) = M + N$.

The second part can be easily verified. To prove the first part, we consider the linear system of equations

$$\alpha_1 u_1 + \cdots + \alpha_k u_k + \beta_1 w_1 + \cdots + \beta_s w_s + \gamma_1 v_1 + \cdots + \gamma_r v_r = 0. \qquad (14.4.8)$$

This system can be rewritten as

$$\alpha_1 u_1 + \cdots + \alpha_k u_k + \beta_1 w_1 + \cdots + \beta_s w_s = -(\gamma_1 v_1 + \cdots + \gamma_r v_r).$$

The vector $v = -(\gamma_1 v_1 + \cdots + \gamma_r v_r) \in M$, as $v_1, \ldots, v_r \in B_M$. But we also have $v = \alpha_1 u_1 + \cdots +
\alpha_k u_k + \beta_1 w_1 + \cdots + \beta_s w_s \in N$ as the vectors $u_1, u_2, \ldots, u_k, w_1, \ldots, w_s \in B_N$. Hence, $v \in M \cap N$ and
therefore, there exist scalars $\delta_1, \ldots, \delta_k$ such that $v = \delta_1 u_1 + \delta_2 u_2 + \cdots + \delta_k u_k$.

Substituting this representation of $v$ in Equation (14.4.8), we get

$$(\alpha_1 - \delta_1) u_1 + \cdots + (\alpha_k - \delta_k) u_k + \beta_1 w_1 + \cdots + \beta_s w_s = 0.$$

But then, the vectors $u_1, u_2, \ldots, u_k, w_1, \ldots, w_s$ are linearly independent as they form a basis. Therefore,
by the definition of linear independence, we get

$$\alpha_i - \delta_i = 0, \ \text{ for } 1 \le i \le k \quad \text{and} \quad \beta_j = 0 \ \text{ for } 1 \le j \le s.$$

Thus the linear system of Equations (14.4.8) reduces to

$$\alpha_1 u_1 + \cdots + \alpha_k u_k + \gamma_1 v_1 + \cdots + \gamma_r v_r = 0.$$

As $B_M$ is a basis of $M$, the only solution for this linear system is

$$\alpha_i = 0, \ \text{ for } 1 \le i \le k \quad \text{and} \quad \gamma_j = 0 \ \text{ for } 1 \le j \le r.$$

Thus we see that the linear system of Equations (14.4.8) has no non-zero solution. And therefore,
the vectors are linearly independent.

Hence, the set $B_2$ is a basis of $M + N$. We now count the vectors in the sets $B_1$, $B_2$, $B_M$ and $B_N$ to
get the required result.
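The dimension formula can be illustrated computationally. The Python sketch below does not follow the basis-extension argument of the proof. Instead it uses the Zassenhaus algorithm, a different and standard technique, which row reduces the block matrix with rows $[u \;\; u]$ for the spanning vectors $u$ of $M$ and $[w \;\; 0]$ for the spanning vectors $w$ of $N$. Rows with a non-zero left half give a basis of $M + N$, and the right halves of the remaining non-zero rows give a basis of $M \cap N$. The subspaces of $\mathbb{Q}^3$ used below and all function names are illustrative assumptions.

```python
from fractions import Fraction


def rref(rows):
    """Row reduced echelon form over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        lead = m[r][c]
        m[r] = [v / lead for v in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [vi - factor * vr for vi, vr in zip(m[i], m[r])]
        r += 1
    return m


def sum_and_intersection(M_span, N_span):
    """Zassenhaus algorithm: return bases of M + N and of M ∩ N."""
    n = len(M_span[0])
    block = [list(u) + list(u) for u in M_span] + [list(w) + [0] * n for w in N_span]
    reduced = rref(block)
    sum_basis = [row[:n] for row in reduced if any(row[:n])]
    int_basis = [row[n:] for row in reduced if not any(row[:n]) and any(row[n:])]
    return sum_basis, int_basis


# M = span{(1,1,0), (0,1,1)} and N = span{(1,0,-1), (0,0,1)}; both spanning sets are bases.
M_span = [[1, 1, 0], [0, 1, 1]]
N_span = [[1, 0, -1], [0, 0, 1]]
sum_basis, int_basis = sum_and_intersection(M_span, N_span)

dim_M, dim_N = len(M_span), len(N_span)
print(dim_M + dim_N, "=", len(sum_basis), "+", len(int_basis))
print(int_basis)
```

Here $(1, 0, -1) = (1, 1, 0) - (0, 1, 1)$ lies in both subspaces, which is the vector the algorithm recovers for the intersection.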
14.5 Proof of Rank-Nullity Theorem


Theorem 14.5.1 Let $T : V \to W$ be a linear transformation and $\{u_1, u_2, \ldots, u_n\}$ be a basis of $V$. Then

1. $\mathcal{R}(T) = L(T(u_1), T(u_2), \ldots, T(u_n))$.

2. $T$ is one-one $\iff$ $\mathcal{N}(T) = \{0\}$ is the zero subspace of $V$ $\iff$ $\{T(u_i) : 1 \le i \le n\}$ is a basis of
$\mathcal{R}(T)$.

3. If $V$ is a finite dimensional vector space then $\dim(\mathcal{R}(T)) \le \dim(V)$. The equality holds if and only if
$\mathcal{N}(T) = \{0\}$.

PROOF. Part 1) can be easily proved. For 2), let $T$ be one-one. Suppose $u \in \mathcal{N}(T)$. This means that
$T(u) = 0 = T(0)$. But then $T$ is one-one implies that $u = 0$. If $\mathcal{N}(T) = \{0\}$ then $T(u) = T(v) \implies
T(u - v) = 0$ implies that $u = v$. Hence, $T$ is one-one.

The other parts can be similarly proved. Part 3) follows from the previous two parts.


The proof of the next theorem is immediate from the fact that T(0) = 0 and the definition of linear 
independence/dependence. 


Theorem 14.5.2 Let $T : V \to W$ be a linear transformation. If $\{T(u_1), T(u_2), \ldots, T(u_n)\}$ is linearly
independent in $\mathcal{R}(T)$ then $\{u_1, u_2, \ldots, u_n\} \subset V$ is linearly independent.


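Theorem 14.5.3 (Rank Nullity Theorem) Let $T : V \to W$ be a linear transformation and $V$ be a finite
dimensional vector space. Then

$$\dim(\text{Range}(T)) + \dim(\mathcal{N}(T)) = \dim(V),$$

or $\rho(T) + \nu(T) = n$.

PROOF. Let $\dim(V) = n$ and $\dim(\mathcal{N}(T)) = r$. Suppose $\{u_1, u_2, \ldots, u_r\}$ is a basis of $\mathcal{N}(T)$. Since
$\{u_1, u_2, \ldots, u_r\}$ is a linearly independent set in $V$, we can extend it to form a basis of $V$. So, there exist
vectors $\{u_{r+1}, u_{r+2}, \ldots, u_n\}$ such that the set $\{u_1, \ldots, u_r, u_{r+1}, \ldots, u_n\}$ is a basis of $V$. Therefore,

$$\begin{aligned}
\text{Range}(T) &= L(T(u_1), T(u_2), \ldots, T(u_n)) \\
&= L(0, \ldots, 0, T(u_{r+1}), T(u_{r+2}), \ldots, T(u_n)) \\
&= L(T(u_{r+1}), T(u_{r+2}), \ldots, T(u_n)),
\end{aligned}$$

which shows that $\text{Range}(T)$ is the span of $\{T(u_{r+1}), T(u_{r+2}), \ldots, T(u_n)\}$.
We now prove that the set $\{T(u_{r+1}), T(u_{r+2}), \ldots, T(u_n)\}$ is a linearly independent set. Suppose the
set is linearly dependent. Then, there exist scalars $\alpha_{r+1}, \alpha_{r+2}, \ldots, \alpha_n$, not all zero, such that

$$\alpha_{r+1} T(u_{r+1}) + \alpha_{r+2} T(u_{r+2}) + \cdots + \alpha_n T(u_n) = 0.$$

Or $T(\alpha_{r+1} u_{r+1} + \alpha_{r+2} u_{r+2} + \cdots + \alpha_n u_n) = 0$, which in turn implies $\alpha_{r+1} u_{r+1} + \alpha_{r+2} u_{r+2} + \cdots + \alpha_n u_n \in
\mathcal{N}(T) = L(u_1, \ldots, u_r)$. So, there exist scalars $\alpha_i$, $1 \le i \le r$, such that

$$\alpha_{r+1} u_{r+1} + \alpha_{r+2} u_{r+2} + \cdots + \alpha_n u_n = \alpha_1 u_1 + \alpha_2 u_2 + \cdots + \alpha_r u_r.$$

That is,

$$\alpha_1 u_1 + \cdots + \alpha_r u_r - \alpha_{r+1} u_{r+1} - \cdots - \alpha_n u_n = 0.$$

Thus $\alpha_i = 0$ for $1 \le i \le n$ as $\{u_1, u_2, \ldots, u_n\}$ is a basis of $V$, contradicting the assumption that
$\alpha_{r+1}, \ldots, \alpha_n$ are not all zero. In other words, we have shown that the set
$\{T(u_{r+1}), T(u_{r+2}), \ldots, T(u_n)\}$ is a basis of $\text{Range}(T)$, so that $\dim(\text{Range}(T)) = n - r$. Now, the required result follows.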
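For a matrix transformation $T(x) = Ax$ with $A$ an $m \times n$ matrix, the theorem says that the rank of $A$ plus the dimension of its null space equals $n$. The Python sketch below checks this on one example by row reducing $A$ exactly, reading off the pivot (basic) columns, and building one null space vector per free column. The names `rref` and `null_space_basis` and the matrix $A$ are illustrative assumptions.

```python
from fractions import Fraction


def rref(rows):
    """Row reduced echelon form over the rationals; returns (matrix, pivot_columns)."""
    m = [[Fraction(v) for v in row] for row in rows]
    pivots = []
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        lead = m[r][c]
        m[r] = [v / lead for v in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [vi - factor * vr for vi, vr in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    return m, pivots


def null_space_basis(A):
    """One basis vector of N(T) per free column; also returns rank(A)."""
    R, pivots = rref(A)
    n = len(A[0])
    free = [c for c in range(n) if c not in pivots]
    basis = []
    for f in free:
        v = [Fraction(0)] * n
        v[f] = Fraction(1)                 # set this free variable to 1, the others to 0
        for row, p in zip(R, pivots):      # solve for the basic variables
            v[p] = -row[f]
        basis.append(v)
    return basis, len(pivots)


A = [[1, 2, 3, 4],
     [2, 4, 6, 8],
     [1, 0, 1, 0]]          # T : Q^4 -> Q^3

basis, rank = null_space_basis(A)
nullity = len(basis)
residuals = [[sum(a * x for a, x in zip(row, v)) for row in A] for v in basis]

print("rank =", rank, " nullity =", nullity, " n =", len(A[0]))
print(all(value == 0 for res in residuals for value in res))
```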


We now state another important implication of the Rank-nullity theorem.

Corollary 14.5.4 Let $T : V \to V$ be a linear transformation on a finite dimensional vector space $V$. Then
$T$ is one-one $\iff$ $T$ is onto $\iff$ $T$ has an inverse.

PROOF. Let $\dim(V) = n$ and let $T$ be one-one. Then $\dim(\mathcal{N}(T)) = 0$. Hence, by the rank-nullity
Theorem 14.5.3, $\dim(\text{Range}(T)) = n = \dim(V)$. Also, $\text{Range}(T)$ is a subspace of $V$. Hence, $\text{Range}(T) =
V$. That is, $T$ is onto.

Suppose $T$ is onto. Then $\text{Range}(T) = V$. Hence, $\dim(\text{Range}(T)) = n$. But then by the rank-nullity
Theorem 14.5.3, $\dim(\mathcal{N}(T)) = 0$. That is, $T$ is one-one.

Now we can assume that $T$ is one-one and onto. Hence, for every vector $u$ in the range, there is a
unique vector $v$ in the domain such that $T(v) = u$. Therefore, for every $u$ in the range, we define

$$T^{-1}(u) = v.$$

That is, $T$ has an inverse.

Let us now assume that $T$ has an inverse. Then it is clear that $T$ is one-one and onto.


14.6 Condition for Exactness 


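Let $D$ be a region in the $xy$-plane and let $M$ and $N$ be real valued functions defined on $D$. Consider an
equation
$$M(x, y(x))\,dx + N(x, y(x))\,dy = 0, \quad (x, y(x)) \in D. \qquad (14.6.9)$$

Definition 14.6.1 (Exact Equation) The Equation (14.6.9) is called Exact if there exists a real valued twice
continuously differentiable function $f$ such that

$$\frac{\partial f}{\partial x} = M \quad \text{and} \quad \frac{\partial f}{\partial y} = N.$$

Theorem 14.6.2 Let $M$ and $N$ be “smooth” in a region $D$. The equation (14.6.9) is exact if and only if

$$\frac{\partial M}{\partial y} = \frac{\partial N}{\partial x}. \qquad (14.6.10)$$

PROOF. Let Equation (14.6.9) be exact. Then there is a “smooth” function $f$ (defined on $D$) such that
$M = \frac{\partial f}{\partial x}$ and $N = \frac{\partial f}{\partial y}$. So, $\frac{\partial M}{\partial y} = \frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial N}{\partial x}$ and so Equation (14.6.10) holds.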


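Conversely, let Equation (14.6.10) hold. We now show that Equation (14.6.9) is exact. Define
$G(x, y)$ on $D$ by

$$G(x, y) = \int M(x, y)\,dx + g(y),$$

where $g$ is any arbitrary smooth function. Then $\frac{\partial G}{\partial x} = M(x, y)$, which shows that

$$\frac{\partial}{\partial x}\frac{\partial G}{\partial y} = \frac{\partial}{\partial y}\frac{\partial G}{\partial x} = \frac{\partial M}{\partial y} = \frac{\partial N}{\partial x}.$$

So $\frac{\partial}{\partial x}\left(N - \frac{\partial G}{\partial y}\right) = 0$, or $N - \frac{\partial G}{\partial y}$ is independent of $x$. Let $\phi(y) = N - \frac{\partial G}{\partial y}$, or $N = \phi(y) + \frac{\partial G}{\partial y}$. Now

$$\begin{aligned}
M(x, y) + N \frac{dy}{dx} &= \frac{\partial G}{\partial x} + \left[\frac{\partial G}{\partial y} + \phi(y)\right] \frac{dy}{dx} \\
&= \left[\frac{\partial G}{\partial x} + \frac{\partial G}{\partial y} \frac{dy}{dx}\right] + \frac{d}{dy}\left(\int \phi(y)\,dy\right) \frac{dy}{dx} \\
&= \frac{d}{dx} G(x, y(x)) + \frac{d}{dx}\left(\int \phi(y)\,dy\right) \quad \text{where } y = y(x) \\
&= \frac{d}{dx}\big(f(x, y)\big) \quad \text{where } f(x, y) = G(x, y) + \int \phi(y)\,dy.
\end{aligned}$$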